Starch modification

ABSTRACT

The present invention relates to a method of altering starch synthesis in a plant by modifying the starch priming activity of the plant. In particular, this is achieved by altering the expression or activity of a starch primer which is preferably encoded by the sequence of SEQ ID NO: 1 or a sequence substantially homologous thereto. Also provided are plants, and plant parts, in which the starch priming activity has been altered.

[0001] This application claims priority to U.S. Provisional Patent Application No. 60/346,907, filed on Jan. 8, 2002 and Great Britain Patent Application No. 0119342.4, filed on Aug. 8, 2001, both of which are incorporated by reference herein.

1 FIELD OF INVENTION

[0002] The present invention is based upon the identification of a protein which initiates starch synthesis in a plant. In particular, the invention relates to plant glycogenin-like nucleic acid molecules, plant glycogenin-like gene products, antibodies to plant glycogenin-like gene products, plant glycogenin-like regulatory regions, vectors and expression vectors with plant glycogenin-like genes, cells, plants and plant parts with plant glycogenin-like genes, modified starch from such plants and the use of the foregoing to improve agronomically valuable plants.

2 BACKGROUND

[0003] Starch, a branched polymer of glucose consisting of largely linear amylose and highly branched amylopectin, is the product of carbon fixation during photosynthesis in plants, and is the primary metabolic energy reserve stored in seeds and fruit. For example, up to 75% of the dry weight of grain in cereals is made up of starch. The importance of starch as a food source is reflected by the fact that two thirds of the world's food consumption (in terms of calories) is provided by the starch in grain crops such as wheat, rice and maize.

[0004] Starch is the product of photosynthesis, and is analogous to the storage compound glycogen in eukaryotes. It is produced in the chloroplasts or amyloplasts of plant cells, these being the plastids of photosynthetic cells and non-photosynthetic cells, respectively. The biochemical pathway leading to the production of starch in leaves has been well characterised, and considerable progress has also been made in elucidating the pathway of starch biosynthesis in storage tissues.

[0005] The biosynthesis of starch molecules is dependent on a complex interaction of numerous enzymes, including several essential enzymes such as ADP-glucose pyrophosphorylase, a series of starch synthases which use ADP-glucose as a substrate for forming chains of glucose linked by alpha-1-4 linkages, and a series of starch branching enzymes that link sections of polymers with alpha-1-6 linkages to generate branched structures (Smith et al., 1995, Plant Physiology, 107:673-677). Further modification of the starch by yet other enzymes, i.e. debranching enzymes or disproportionating enzymes, can be specific to certain species.

[0006] The fine structure of starch is a complex mixture of D-glucose polymers that consist essentially of linear chain (amylose) and branched chain (amylopectin) glucans. Typically, amylose makes up between 10 and 25% of plant starch, but varies significantly among species. Amylose is composed of linear D-glucose chains typically 250-670 glucose units in length (Tester, 1997, in: Starch Structure and Functionality, Frazier et al., eds., Royal Society of Chemistry, Cambridge, UK). The linear regions of amylopectin are composed of low molecular weight and high molecular weight chains, with the low molecular weight chains ranging from 5 to 30 glucose units and the high molecular weight chains from 30 to 100 or more. The amylose/amylopectin ratio and the distribution of low and high molecular weight D-glucose chains can affect starch granule properties such as gelatinization temperature, retrogradation, and viscosity (Blanshard, 1987). The characteristics of the fine structure of starch mentioned above have been examined at length and are well known in the art of starch chemistry.

[0007] It is known that starch granule size and amylose percentage change during kernel development in maize and during tobacco leaf development (Boyer et al., 1976, Cereal Chem 53:327-337). In their classic study, Boyer et al. concluded that the amylose percentage of starch decreases with decreasing granule size in later stages of maize kernel development.

[0008] As mentioned above, glycogen, rather than starch, serves as the glucose reserve in animals. The biosynthesis of glycogen in eukaryotes involves chain elongation through the formation of linear alpha-1,4 glycosidic linkages catalysed by the enzyme glycogen synthase. Evidence for a distinct initiation step involving a self-glucosylating protein, known as glycogenin or SGP, came from work directed at mammalian systems (Smythe et al., Eur. J. Biochem 200:625-631 (1990) and Whelan, Bioessays 5:136-140 (1986)).

[0009] Cheng et al. (Mol. and Cell Biol. 15(12): 6632-6640 (1995)) report the identification of two yeast genes whose products are implicated in the biosynthesis of glycogen. The two genes, Glg1 and Glg2, encode self-glucosylating proteins which in vitro act as primers for the elongation reaction catalysed by glycogen synthase. Disruption of both these genes results in the inability to synthesise glycogen, despite normal levels of glycogen synthase. Glycogenin homologues have been identified in Caenorhabditis elegans and humans (Mu et al., J. Biol. Chem. 272(44): 27589-27597 (1997)).

[0010] It is now well established that glycogen synthesis is initiated on the primer protein, glycogenin or SGP, which remains covalently attached to the resulting macromolecule. The initiation step is thought to involve glycogenin growing a covalently attached oligosaccharide primer linked by a unique carbohydrate-protein bond through the hydroxyl group of the Tyr residue, Tyr 194. Once this oligosaccharide chain on glycogenin has been extended sufficiently, glycogen synthase is able to catalyse elongation and, together with the branching enzyme, form the mature glycogen molecule (Rodriguez and Whelan, Biochem Biophy Res Comm, 132:829-836; Roach and Skurat, 1997, in Progress in Nucleic Acid Research and Molecular Biology p289-316, Academic Press).

[0011] Previous workers have set out to determine whether a priming molecule, such as a self glucosylating protein, is responsible for the initiation of starch synthesis in plants. WO94/04693 (Zeneca Ltd.) describes the purification of a putative starch priming protein molecule from maize endosperm, known as amylogenin, and isolation of a partial cDNA. The maize amylogenin showed no sequence homology with glycogenin and exhibited a novel glucose-protein bond (Singh et al., FEBS Letters 376: 61-64 (1995)). However, based upon the sequence homology and the reported properties of the maize protein, it has subsequently been shown that the sequence of the maize nucleic acid molecule reported above is homologous to a reversibly glycosylated polypeptide (RGP1) from pea (Dhugga et al., Proc. Natl. Acad. Sci. USA 94:7679-7684 (1997)). RGP1 is localised to the Golgi apparatus and is thought to be involved in cell wall synthesis. This has dispelled the initial idea that the “amylogenin” molecule of WO94/04693 is involved in starch synthesis. In further work (Langeveld, M. J. S et al. 2002 Plant Physiol, 129, pp 278-289) it is concluded that wheat and rice RGPs do not play a role in starch synthesis in a way similar to the functioning of glycogenin as a primer for glycogen synthesis. It is reported that RGP1 and RGP2 proteins in wheat and rice have different functions to glycogenin.

[0012] Lightner et al. US 2002/0001843 described fragments of putative “corn, wheat, and rice glycogenin and water stress proteins.” Lightner et al. did not demonstrate the functionality of the fragments, but only their sequence homology to glycogenin from animals. To date, therefore, no one has identified and demonstrated a functional protein for starch initiation or starch priming in plants.

[0013] Purified starch is used in numerous food and industrial applications and is the major source of carbohydrates in the human diet. Typically, starch is mixed with water and cooked to form a thickening agent or gel. Of central importance are the temperature at which the starch cooks, the viscosity that the agent or gel reaches, and the stability of the gel viscosity over time. The physical properties of unmodified starch limit its usefulness in many applications. As a result, considerable effort and expenditure is devoted to chemically modifying starch (i.e. cross-linking and substitution) in order to overcome the numerous limitations of unmodified starch and to expand its industrial usefulness. Modified starches can be used in foods, paper, textiles, and adhesives.

3 SUMMARY OF THE INVENTION

[0014] The invention provides isolated nucleic acids which encompass plant glycogenin-like nucleic acid molecules, plant glycogenin-like gene products (including, but not limited to, transcriptional products such as mRNAs, antisense and ribozyme molecules, and translational products such as plant glycogenin-like proteins, polypeptides, peptides and fusion proteins related thereto), antibodies to plant glycogenin-like gene products, plant glycogenin-like regulatory regions, vectors and expression vectors with plant glycogenin-like genes, cells, plants and plant parts with plant glycogenin-like genes, modified starch from such plants and the use of the foregoing to improve agronomically valuable plants, including but not limited to maize, wheat, barley and potato.

[0015] The invention is based upon the identification of a protein responsible for initiation of starch synthesis in plants, which, despite continued efforts over the last few years, no one had yet successfully identified. In particular, the inventors have discovered nucleic acid molecules from Arabidopsis which have sequences that are homologous to the known glycogenin genes of yeast and human. Analysis of one of these nucleic acid molecules indicates that it contains a sequence encoding a transit peptide for plastid localization of the gene product, consistent with a role in starch synthesis; the gene product is referred to herein as plant glycogenin-like starch initiation protein (PGSIP). Glycogenin-like genes from other plant species have been identified by analysis of sequence homology with the Arabidopsis sequences. The genes of the invention do not show homology to the amylogenin sequences or starch sequences of the prior art.

[0016] Modulation of the initiation of starch synthesis allows various aspects of the biosynthetic process to be regulated. By altering aspects of the biosynthesis process such as temporal and spatial specificity, yield and storage, the carbohydrate profile of the plant may be altered in magnitude and direction in ways that may be more favorable for nutritional or industrial uses.

[0017] The present invention provides for an isolated nucleic acid molecule that i) comprises a nucleotide sequence which encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 3, or a fragment thereof; ii) comprises a nucleotide sequence at least 40% identical to SEQ ID NOs: 1 or 2, or a complement thereof as determined using the BESTFIT or GAP programs with a gap weight of 50 and a length weight of 3; or iii) hybridizes to a nucleic acid molecule consisting of SEQ ID NOs: 1 or 2 under low stringency conditions of hybridization consisting of washing at 60° C. for 2×15 minutes at 2×SSC (Sodium Citrate), 0.5% SDS, or a complement thereof. The present invention also provides for an isolated nucleic acid molecule of the invention comprising SEQ ID NOs: 1 or 2 or a complement thereof. In an embodiment of the invention, an isolated nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of nucleotide residues 377 to 423, 516 to 592, 1039 to 1655, 1762 to 2536, and 2991 to 3264 of SEQ ID NO: 1.
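As an illustration only, the idea of a percent-identity threshold can be sketched in Python. This sketch is not the BESTFIT or GAP program: it assumes the two sequences have already been aligned (with "-" marking gaps), it ignores the gap weight and length weight parameters entirely, and it counts identity over ungapped columns, which is one of several conventions in use. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over ungapped columns of a pre-computed pairwise alignment.

    Assumption: both strings are the same length and use '-' for gaps.
    This is a simplification and does not reproduce BESTFIT or GAP scoring.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold):
    """Return True if the alignment reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy alignment, not taken from any sequence of the invention.
    print(percent_identity("ACGT-A", "ACCTGA"))
    print(meets_threshold("ACGT-A", "ACCTGA", 40))
```

The thresholds recited in the text (for example 40%, 70% or 98%) would be passed as the `threshold` argument; the actual values reported by BESTFIT or GAP may differ from this simplified count.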

[0018] Another embodiment of the invention encompasses an isolated nucleic acid molecule of the invention that i) comprises a nucleotide sequence which encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 11, or a fragment thereof; ii) comprises a nucleotide sequence at least 70% identical to SEQ ID NO: 10, or a complement thereof as determined using the BESTFIT or GAP programs with a gap weight of 50 and a length weight of 3, wherein the nucleotide sequence does not encode an amino acid of SEQ ID NO: 35; or iii) hybridizes to a nucleic acid molecule consisting of SEQ ID NO: 10 under stringent conditions of hybridization, or a complement thereof, wherein the sequence does not encode an amino acid of SEQ ID NO: 35. In a related embodiment, the isolated nucleic acid molecule of the invention comprises SEQ ID NO: 10 or a complement thereof. In another related embodiment an isolated nucleic acid molecule of the invention comprises the amino acid sequence that is at least 98% identical to SEQ ID NO: 9 as determined using the BESTFIT or GAP programs with a gap weight of 12 and a length weight of 4. The invention also encompasses an isolated nucleic acid molecule that comprises the nucleotide sequence of SEQ ID NO: 8 or a complement thereof.

[0019] In an embodiment of the invention, an isolated nucleic acid molecule of the invention i) comprises a nucleotide sequence which encodes a polypeptide comprising the amino acid sequence of SEQ ID NOs: 7, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 34, or a fragment thereof; ii) comprises a nucleotide sequence at least 70% identical to SEQ ID NOs: 4, 5, 6, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, 33, or a complement thereof as determined using the BESTFIT or GAP programs with a gap weight of 50 and a length weight of 3; or iii) hybridizes to a nucleic acid molecule consisting of SEQ ID NOs: 4, 5, 6, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, 33 under stringent conditions of hybridization, or a complement thereof. In a related embodiment, the isolated nucleic acid molecule of the invention comprises SEQ ID NOs: 4, 5, 6, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, 33, or a complement thereof. In another embodiment of the invention, a fragment of the isolated nucleic acid molecule of the invention comprises at least 40, 60, 80, 100 or 150 contiguous nucleotides of the nucleic acid molecule. In yet another embodiment, the isolated nucleic acid molecule of the invention comprises the nucleotide sequence of nucleotides 1-195 of SEQ ID NO: 2, or a complement thereof.

[0020] According to one aspect of the invention, an isolated polypeptide of the invention comprises the amino acid sequence of amino acid residues 1-65 of SEQ ID NO: 3, or a fragment thereof. In a related aspect, an isolated polypeptide comprises i) an amino acid sequence that is at least 70% identical to SEQ ID NO: 3 or a fragment thereof as determined using the BESTFIT or GAP programs with a gap weight of 12 and a length weight of 4; ii) an amino acid sequence encoded by the nucleic acid molecule of the invention; or iii) an amino acid sequence of SEQ ID NO: 3.

[0021] An embodiment of the invention encompasses an isolated polypeptide of the invention that comprises i) an amino acid sequence at least 70% identical to SEQ ID NO: 11 as determined using the BESTFIT or GAP programs with a gap weight of 12 and a length weight of 4, or a fragment thereof; ii) an amino acid sequence encoded by the nucleic acid molecule of the invention; or iii) an amino acid sequence of SEQ ID NO: 11.

[0022] In another embodiment of the invention, an isolated polypeptide of the invention comprises i) an amino acid sequence that is at least 98% identical to SEQ ID NO: 9 as determined using the BESTFIT or GAP programs with a gap weight of 12 and a length weight of 4; ii) an amino acid sequence encoded by the nucleic acid molecule of SEQ ID NO: 8, or a complement thereof; or iii) an amino acid sequence of SEQ ID NO: 9, or a fragment thereof.

[0023] The invention further provides for an isolated polypeptide that comprises i) an amino acid sequence that is at least 70% identical to SEQ ID NOs: 7, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 34, or a fragment thereof as determined using the BESTFIT or GAP programs with a gap weight of 12 and a length weight of 4; ii) an amino acid sequence encoded by the nucleic acid molecule of the invention; or iii) an amino acid sequence of SEQ ID NOs: 7, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 34. In an embodiment of the invention, a fragment of a polypeptide of the invention comprises at least 5 amino acid residues, wherein said fragment is a portion of the polypeptide encoded by a nucleic acid molecule selected from the group consisting of exon I, exon II, exon III, exon IV and exon V of SEQ ID NO: 1.

[0024] Another embodiment of the invention encompasses the polypeptide of SEQ ID NOs: 3, 7, 9, 11, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 34 further comprising one or more conservative amino acid substitutions. In yet another embodiment, the invention provides for a fusion protein comprising the amino acid sequence of the invention and a heterologous protein.

[0025] The invention provides for an isolated polypeptide fragment or immunogenic fragment that comprises at least 5, 8, 10, 15, 20, 25, 30 or 35 consecutive amino acids of the polypeptide. The invention further provides for an antibody that immunospecifically binds to a polypeptide of the invention.

[0026] In one embodiment the invention encompasses a method for making a polypeptide of the invention, comprising the steps of a) culturing a cell comprising a recombinant polynucleotide encoding the polypeptide of the invention under conditions that allow said polypeptide to be expressed by said cell; and b) recovering the expressed polypeptide.

[0027] According to another aspect of the invention, a complex comprises a polypeptide encoded by a nucleic acid molecule of the invention and a starch molecule. In one embodiment of the complex of the invention, the starch molecule comprises from 1 to 700 glucose units. In another embodiment of the complex of the invention the starch molecule comprises branching chains of glucose polysaccharides.

[0028] According to yet another aspect of the invention, a vector comprises a nucleic acid molecule of the invention. Alternatively, an expression vector comprises a nucleic acid molecule of the invention and at least one regulatory region operably linked to the nucleic acid molecule.

[0029] In another embodiment, the expression vector of the invention comprises a regulatory region that confers chemically-inducible, dark-inducible, developmentally regulated, developmental-stage specific, wound-induced, environmental factor-regulated, organ-specific, cell-specific, and/or tissue-specific expression of the nucleic acid molecule or constitutive expression of the nucleic acid molecule of the invention. In yet another embodiment, the expression vector of the invention comprises a regulatory region selected from the group consisting of a 35S CaMV promoter, a rice actin promoter, a patatin promoter, and a high molecular weight glutenin gene of wheat. In another embodiment, an expression vector of the invention comprises the antisense sequence of a nucleic acid molecule of the invention, wherein the antisense sequence is operably linked to at least one regulatory region.

[0030] The invention also provides for a genetically-engineered cell which comprises a nucleic acid molecule of the invention. In one embodiment, a cell comprises the expression vector of the invention comprising a nucleic acid molecule of the invention and at least one regulatory region operably linked to the nucleic acid molecule. In another embodiment, a cell comprises the expression vector of the invention comprising the antisense sequence of a nucleic acid molecule of the invention, wherein the antisense sequence is operably linked to at least one regulatory region.

[0031] In another embodiment of the invention, a cell comprises the expression vector comprising the antisense sequence of a nucleic acid molecule of the invention, wherein the antisense sequence is operably linked to at least one regulatory region.
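By way of illustration only, an antisense sequence is conventionally the reverse complement of the sense strand. A minimal Python sketch follows, assuming a plain DNA string using only the letters A, C, G and T; the function name and the example sequence are hypothetical and are not taken from any sequence of the invention.

```python
# Translation table mapping each DNA base to its complement (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence.

    Assumption: seq contains only A, C, G, T; other characters pass
    through unchanged and are simply reversed in position.
    """
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense = "ATGGCC"  # toy example
    print(reverse_complement(sense))
```

Applying the function twice returns the original sequence, which gives a simple sanity check on any antisense insert designed in this way.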

[0032] Yet another aspect of the invention provides for a genetically-engineered plant comprising the isolated nucleic acid molecule of the invention. The invention also provides for the genetically-engineered plant comprising the isolated nucleic acid molecule of the invention and progeny thereof, further comprising a transgene encoding an antisense nucleotide sequence. The invention also provides for the genetically-engineered plant comprising the isolated nucleic acid molecule of the invention, further comprising an RNA interference construct.

[0033] An embodiment of the invention encompasses a cell comprising a 35S CaMV constitutive promoter operably linked to a nucleic acid molecule of SEQ ID NO:2 or a rice actin promoter operably linked to an RNA interference construct comprising fragments of a nucleic acid molecule of SEQ ID NO:2.

[0034] Another embodiment of the invention provides for a method of altering starch synthesis in a plant comprising introducing into a plant an expression vector of the invention, such that starch synthesis is altered relative to a plant without the expression vector. Yet another embodiment of the invention provides for a method of altering starch synthesis in a plant comprising introducing into a plant at least an expression vector comprising the antisense sequence of a nucleic acid molecule of the invention, wherein the antisense sequence is operably linked to at least one regulatory region, such that starch synthesis is altered in comparison to a plant without the expression vector.

[0035] In an embodiment of the invention, a method of altering starch granules in a plant comprises introducing into a plant at least an expression vector comprising a nucleic acid molecule of the invention and at least one regulatory region operably linked to the nucleic acid molecule, such that the starch granules are altered in comparison to a plant without the expression vector.

[0036] In another embodiment of the invention, a method of altering starch granules in a plant comprises introducing into a plant at least an expression vector of claim 30, such that the starch granules are altered in comparison to a plant without the expression vector.

[0037] In yet another embodiment of the invention, the method of altering starch granules in a plant comprises introducing into a plant at least an expression vector comprising a nucleic acid molecule of the invention and at least one regulatory region operably linked to the nucleic acid molecule, such that the starch granules are absent from leaves of the plant comprising at least an expression vector.

[0038] In a preferred embodiment of the invention, a plant part comprises a nucleic acid molecule of the invention resulting in an alteration in starch synthesis. In another preferred embodiment the plant part is a tuber, seed, or leaf.

[0039] The invention also provides for the modified starch obtained from the plant parts of the invention, wherein the modification is selected from the group consisting of a ratio of amylose to amylopectin, amylose content, size of starch granules, quantity of size of starch granules, a ratio of small to large starch granules, and rheological properties of the starch as measured using viscometric analysis.

3.1 SEQUENCE IDENTIFIERS

[0040] The present invention will now be illustrated by way of non-limiting examples of biological sequences in which:

[0041] SEQ ID NO: 1 shows the genomic sequence of a starch primer gene isolated from Arabidopsis thaliana referred to herein as plant glycogenin-like starch initiation protein (PGSIP), at3g18660, GenBank Accession No. NM_112752. The gene includes part of the promoter region, where the putative TATA and CAAT box are located at nucleotides 424-428 and 373-376 respectively. The exons are located at nucleotides 377 to 423, 516 to 592, 1039 to 1655, 1762 to 2536, and 2991 to 3264.
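As a purely illustrative sketch, the exon coordinates listed for SEQ ID NO: 1 can be handled in Python as 1-based inclusive ranges. The coordinates are those given above; the function names are hypothetical, and the placeholder string stands in for the genomic sequence, which is not reproduced here.

```python
# Exon coordinates (1-based, inclusive) as listed for SEQ ID NO: 1.
EXONS = [(377, 423), (516, 592), (1039, 1655), (1762, 2536), (2991, 3264)]


def exon_lengths(exons):
    """Return the length of each exon given 1-based inclusive coordinates."""
    return [end - start + 1 for start, end in exons]


def splice(genomic, exons):
    """Concatenate the exon segments of a genomic sequence string."""
    # Convert 1-based inclusive coordinates to 0-based half-open slices.
    return "".join(genomic[start - 1:end] for start, end in exons)


if __name__ == "__main__":
    lengths = exon_lengths(EXONS)
    print(lengths, sum(lengths))
    # Placeholder sequence of the right minimum length (assumption: real
    # sequence data would be loaded from the sequence listing instead).
    placeholder = "N" * 3264
    print(len(splice(placeholder, EXONS)))
```

The same `splice` helper could be applied to the actual genomic sequence to assemble the five exons in order, under the assumption that the listed coordinates refer to the strand as given in the sequence listing.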

[0042] SEQ ID NO: 2 shows the deduced cDNA sequence of Arabidopsis thaliana PGSIP with protein translation. The transit peptide is located at nucleotides 1-195.

[0043] SEQ ID NO:3 shows the amino acid sequence representing the Arabidopsis thaliana PGSIP protein. The predicted transit peptide is located at amino acid residues 1-65.
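The transit peptide coordinates given for the cDNA (nucleotides 1-195) and for the protein (residues 1-65) are related by the three-nucleotide codon. A minimal Python sketch of that conversion follows, assuming the reading frame starts at nucleotide 1 of the cDNA; the function name is hypothetical.

```python
def nt_range_to_aa_range(nt_start, nt_end):
    """Map a 1-based inclusive, in-frame nucleotide range to amino acid residues.

    Assumption: the coding frame begins at nucleotide 1, so codon k spans
    nucleotides 3k-2 to 3k.
    """
    if (nt_start - 1) % 3 != 0 or nt_end % 3 != 0:
        raise ValueError("range must start and end on codon boundaries")
    return ((nt_start - 1) // 3 + 1, nt_end // 3)


if __name__ == "__main__":
    # Transit peptide coordinates as stated for SEQ ID NO: 2.
    print(nt_range_to_aa_range(1, 195))
```

Under the stated assumption, nucleotides 1-195 correspond to residues 1-65, consistent with the coordinates given for SEQ ID NO: 2 and SEQ ID NO: 3.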

[0044] SEQ ID NO:4 shows the nucleotide sequence of the maize EST of GenBank Accession No. BF729544 with homology to the Arabidopsis thaliana PGSIP gene. The nucleotide sequence with homology to the Arabidopsis thaliana PGSIP gene is located at nucleotides 1-557.

[0045] SEQ ID NO:5 shows the nucleotide sequence of the maize EST BG837930 showing homology to the Arabidopsis thaliana PGSIP gene. The nucleotide sequence with homology to the Arabidopsis thaliana PGSIP gene is located at nucleotides 1-726.

[0046] SEQ ID NO:6 shows the deduced cDNA of the Arabidopsis glycogenin-like gene (at1g77130) with protein translation. The protein sequence has homology to a small region (amino acid residues 1023-1146) of the dull1 gene from maize (064923).

[0047] SEQ ID NO:7 shows the amino acid sequence of at1g77130.

[0048] SEQ ID NO:8 shows the deduced cDNA of the Arabidopsis glycogenin-like gene (at1g08990) GenBank Accession No. NM_100770.

[0049] SEQ ID NO:9 shows the amino acid sequence of at1g08990.

[0050] SEQ ID NO:10 shows the deduced cDNA of the Arabidopsis glycogenin-like gene (at1g54940) GenBank Accession No. NM_104367.

[0051] SEQ ID NO:11 shows the amino acid sequence of at1g54940.

[0052] SEQ ID NO:12 shows the deduced cDNA of the Arabidopsis glycogenin-like gene (at4g33330) GenBank Accession No. NM_119487.

[0053] SEQ ID NO:13 shows the amino acid sequence of at4g33330.

[0054] SEQ ID NO:14 shows the deduced cDNA of the Arabidopsis glycogenin-like gene (at4g33340) GenBank Accession No. NM_119488.

[0055] SEQ ID NO:15 shows the amino acid sequence of at4g33340.

[0056] SEQ ID NO:16 shows the nucleotide sequence of Barley EST Seq1.

[0057] SEQ ID NO:17 shows the amino acid sequence of Barley EST Seq1.

[0058] SEQ ID NO:18 shows the nucleotide sequence of Barley EST Seq2.

[0059] SEQ ID NO:19 shows the amino acid sequence of Barley EST Seq2.

[0060] SEQ ID NO:20 shows the nucleotide sequence of a wheat EST.

[0061] SEQ ID NO:21 shows the first half of the amino acid sequence of the wheat EST.

[0062] SEQ ID NO:22 shows the second half of the amino acid sequence of the wheat EST.

[0063] SEQ ID NO:23 shows the deduced cDNA of the Arabidopsis gene EMBL:AY062695 GenBank Accession No. AY062695 with homology to the Arabidopsis PGSIP gene.

[0064] SEQ ID NO:24 shows the amino acid sequence of EMBL:AY062695.

[0065] SEQ ID NO:25 shows the deduced cDNA of the Rice gene SPTrEMBL:Q94HG3 GenBank Accession No. AC079633 with homology to the Arabidopsis PGSIP gene.

[0066] SEQ ID NO:26 shows the amino acid sequence of SPTrEMBL:Q94HG3.

[0067] SEQ ID NO:27 shows the nucleotide sequence of Maize EST Seq1.

[0068] SEQ ID NO:28 shows the amino acid sequence of Maize EST Seq1.

[0069] SEQ ID NO:29 shows the nucleotide sequence of Maize EST Seq2.

[0070] SEQ ID NO:30 shows the amino acid sequence of Maize EST Seq2.

[0071] SEQ ID NO:31 shows the nucleotide sequence of Maize EST Seq3.

[0072] SEQ ID NO:32 shows the amino acid sequence of Maize EST Seq3.

[0073] SEQ ID NO:33 shows the nucleotide sequence of Maize EST Seq4.

[0074] SEQ ID NO:34 shows the amino acid sequence of Maize EST Seq4.

[0075] SEQ ID NO:35 shows an amino acid sequence as a result of a conceptual translation of a portion of a genomic clone from Arabidopsis thaliana as disclosed in U.S. Patent Application No. 2002/0001843.

4 BRIEF DESCRIPTION OF THE FIGURES

[0076] FIG. 1 shows the plasmid containing the Arabidopsis thaliana plant glycogenin-like starch initiation protein (PGSIP) gene.

[0077] FIG. 2 shows the plasmid map for pTPYES.

[0078] FIG. 3 shows the plasmid map for pNTPYES.

[0079] FIG. 4A shows a genomic region containing AT3g18660 (PGSIP); 4B shows a non-radioactive Southern blot of Arabidopsis, wheat and maize genomic DNA probed with C-terminus AT3g18660 cDNA under high stringency conditions. N-NcoI, A-AvaI, C-ClaI.

[0080] FIG. 5A shows a non-radioactive Southern blot of Arabidopsis, wheat and maize genomic DNA probed with N-terminal AT3g18660 (PGSIP) cDNA fragment under low stringency conditions. N-NcoI, A-AvaI, C-ClaI. Lane M is a marker, lane 1 is AT (EcoRI), lane 2 is AT (XhoI), lane 3 is AT (EcoRV), lane 4 is wheat (EcoRI), lane 5 is wheat (XhoI), lane 6 is wheat (EcoRV), lane 7 is maize (EcoRI), lane 8 is maize (XhoI), and lane 9 is maize (EcoRV); 5B shows a non-radioactive Southern blot of Arabidopsis, wheat and maize genomic DNA probed with C-terminal AT3g18660 (PGSIP) cDNA fragment under low stringency conditions. N-NcoI, A-AvaI, C-ClaI. Lane M is a marker, lane 1 is AT (EcoRI), lane 2 is AT (XhoI), lane 3 is AT (EcoRV), lane 4 is wheat (EcoRI), lane 5 is wheat (XhoI), lane 6 is wheat (EcoRV), lane 7 is maize (EcoRI), lane 8 is maize (XhoI), and lane 9 is maize (EcoRV).

[0081] FIG. 6 shows the cloning strategy and plasmid maps for the production of the PGSIP RNAi construct pCL76 SCV.

[0082] FIG. 7 shows the plasmid map for pCL68 SCV (sense expression construct) containing the AT3g18660 (PGSIP) cDNA.

[0083] FIG. 8 shows the plasmid map for pCL76 SCV (RNAi construct) containing fragments of the AT3g18660 (PGSIP) cDNA.

[0084] FIG. 9 shows the plasmid map for pMC177 (sense expression construct) containing the AT3g18660 (PGSIP) under rice actin promoter used in barley and Arabidopsis transformation.

[0085] FIG. 10 shows the plasmid map for pMC176 (RNAi construct) containing the AT3g18660 (PGSIP) under rice actin promoter used in barley and Arabidopsis transformation.

[0086] FIG. 11A shows the results of iodine staining of leaves of barley which was shown to be PCR positive for the (pCL76 SCV) RNAi PGSIP constructs. Starch grains are absent; 11B shows the results of iodine staining of leaves of barley which was shown to be PCR negative for the (pCL76 SCV) RNAi PGSIP constructs. Starch grains are visible.

5 DETAILED DESCRIPTION OF THE INVENTION

[0087] The invention relates to a family of plant glycogenin-like genes, also referred to as starch primer genes. In various embodiments, the invention provides plant glycogenin-like nucleic acid molecules including, but not limited to, plant glycogenin-like genes; plant glycogenin-like regulatory regions; plant glycogenin-like promoters; and vectors incorporating sequences encoding plant glycogenin-like nucleic acid molecules of the invention. Also provided are plant glycogenin-like gene products, including, but not limited to, transcriptional products such as mRNAs, antisense and ribozyme molecules, and translational products such as the plant glycogenin-like protein, polypeptides, peptides and fusion proteins related thereto; genetically engineered host cells that contain any of the foregoing nucleic acid molecules and/or coding sequences or complements, variants, or fragments thereof operatively associated with a regulatory element that directs the expression of the gene and/or coding sequences in the host cell; genetically-engineered plants derived from host cells; modified starch and starch granules produced by genetically-engineered host cells and plants; and the use of the foregoing to improve agronomically valuable plants.

[0088] In the context of the present invention, a “starch primer”, used interchangeably with “plant glycogenin-like protein”, includes any protein which is capable of initiating starch production in a plant. By definition, the plant glycogenin-like protein will be of plant origin. Preferred fragments of plant glycogenin-like proteins are those which retain the ability to initiate starch synthesis.

[0089] For purposes of clarity, and not by way of limitation, the invention is described in the subsections below in terms of (a) plant glycogenin-like nucleic acid molecules; (b) plant glycogenin-like gene products; (c) transgenic plants that ectopically express plant glycogenin-like protein; (d) transgenic plants in which endogenous plant glycogenin-like protein expression is suppressed; and (e) starch characterized by altered structure and physical properties produced by the methods of the invention.

5.1 Plant Glycogenin-Like Nucleic Acids

[0090] The nucleic acid molecules of the invention may be DNA or RNA and comprise the nucleotide sequences of a plant glycogenin-like gene, or fragments or variants thereof. A polynucleotide is intended to include DNA molecules (e.g., cDNA, genomic DNA), RNA molecules (e.g., hnRNA, pre-mRNA, mRNA, double-stranded RNA), and DNA or RNA analogs generated using nucleotide analogs. The polynucleotide can be single-stranded or double-stranded.

[0091] The nucleic acid molecules are characterized by their homology to known glycogen primer (glycogenin) genes, such as those from yeast (Glg1 and Glg2), human (any isoform), C. elegans, rat or rabbit, or to a plant glycogenin-like gene such as those defined herein. A preferred nucleic acid molecule of this embodiment is one that encodes the amino acid sequence of SEQ ID NO: 2, or a fragment or variant thereof, or a nucleic acid molecule comprising a sequence substantially similar to SEQ ID NO: 2. In a most preferred embodiment, the nucleic acid molecule comprises the nucleotide sequence shown in SEQ ID NO: 1, or a fragment or variant thereof, or a sequence substantially similar to SEQ ID NO: 1. The variants may be allelic variants, that is, multiple forms of a particular gene or of the protein encoded by a particular gene. Fragments of a plant glycogenin-like gene may include regulatory elements of the gene such as promoters, enhancers, transcription factor binding sites, and/or segments of a coding sequence, for example a conserved domain, exon, or transit peptide.

[0092] In a preferred embodiment, the nucleic acid molecules of the invention are comprised of full length sequences in that they encode an entire plant glycogenin-like protein as it occurs in nature. Examples of such sequences include SEQ ID NOs: 1, 2, 6, 8, 10, 12, and 14. The corresponding amino acid sequences of full length glycogenin-like proteins are SEQ ID NOs: 3, 7, 9, 11, 13, and 15. In an alternative embodiment, the nucleic acid molecules of the invention comprise a nucleotide sequence of SEQ ID NOs: 1, 2, 4, 5, 6, 8, 10, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, or 33.

[0093] The nucleic acid molecules and their variants can be identified by several approaches including, but not limited to, analysis of sequence similarity and hybridization assays.

[0094] In the context of the present invention the term “substantially homologous,” “substantially identical,” or “substantial similarity,” when used herein with respect to sequences of nucleic acid molecules, means that the sequence has at least 45% sequence identity with the reference sequence, preferably at least 50% sequence identity, more preferably at least 60%, 70%, 80% or 90%, and most preferably at least 95% sequence identity with said sequence; in some cases the sequence identity may be 98% or, more preferably, 99% or above. Alternatively, the term means that the nucleic acid molecule is capable of hybridizing to the complement of the nucleic acid molecule having the reference sequence under stringent conditions.
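The two alternative limbs of this definition (a minimum percent identity, or hybridization under stringent conditions) can be written as a simple predicate. The following Python sketch is illustrative only: the names `IDENTITY_TIERS`, `highest_tier_met` and `is_substantially_homologous` are hypothetical, and the hybridization result is assumed to be supplied from a separate laboratory assay.

```python
# Identity thresholds named in the definition, from broadest to most preferred.
IDENTITY_TIERS = (45, 50, 60, 70, 80, 90, 95, 98, 99)


def highest_tier_met(percent_identity):
    """Return the highest listed threshold that is met, or None if below 45%."""
    met = [tier for tier in IDENTITY_TIERS if percent_identity >= tier]
    return met[-1] if met else None


def is_substantially_homologous(percent_identity,
                                hybridizes_under_stringent_conditions=False):
    """Either limb of the definition is sufficient on its own."""
    return (highest_tier_met(percent_identity) is not None
            or hybridizes_under_stringent_conditions)
```

A sequence with 96.5% identity would fall in the "most preferably at least 95%" tier, while a sequence with 30% identity would qualify only if it also hybridizes under stringent conditions.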

[0095] “% identity”, as known in the art, is a measure of the relationship between two polynucleotides or two polypeptides, as determined by comparing their sequences. In general, the two sequences to be compared are aligned to give a maximum correlation between the sequences. The alignment of the two sequences is examined and the number of positions giving an exact amino acid or nucleotide correspondence between the two sequences is determined, divided by the total length of the alignment and multiplied by 100 to give a % identity figure. This % identity figure may be determined over the whole length of the sequences to be compared, which is particularly suitable for sequences of the same or very similar length and which are highly homologous, or over shorter defined lengths, which is more suitable for sequences of unequal length or which have a lower level of homology.
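The calculation described above (exact matches divided by the total alignment length, multiplied by 100) can be sketched in Python as follows. This is a minimal illustration, assuming the two sequences have already been aligned by some other program and that gaps are written as "-"; the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the full length of a pairwise alignment.

    Both inputs are rows of the same alignment, so they must have equal
    length. A column counts as a match only when both rows carry the same
    residue; a gap character never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Four of the six alignment columns match exactly.
print(round(percent_identity("ACGT-A", "ACCTTA"), 1))  # 66.7
```

To determine identity over a shorter defined length, the same function can be applied to the corresponding slice of both alignment rows.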

[0096] For example, sequences can be aligned with the software clustalw under Unix, which generates a file with a “.aln” extension. This file can then be imported into the BioEdit program (Hall, T. A. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl. Acids. Symp. Ser. 41:95-98), which opens the .aln file. In the BioEdit window, one can choose individual sequences (two at a time) and align them. This method allows for comparison of the entire sequences.

[0097] Methods for comparing the identity of two or more sequences are well known in the art. Thus, for instance, programs are available in the Wisconsin Sequence Analysis Package, version 9.1 (Devereux J et al, Nucleic Acids Res. 12:387-395, 1984, available from Genetics Computer Group, Madison, Wisconsin, USA). The determination of percent identity between two sequences can be accomplished using a mathematical algorithm. For example, the programs BESTFIT and GAP may be used to determine the % identity between two polynucleotides and the % identity between two polypeptide sequences. BESTFIT uses the “local homology” algorithm of Smith and Waterman (Advances in Applied Mathematics, 2:482-489, 1981) and finds the best single region of similarity between two sequences. BESTFIT is more suited to comparing two polynucleotide or two polypeptide sequences which are dissimilar in length, the program assuming that the shorter sequence represents a portion of the longer. In comparison, GAP aligns two sequences finding a “maximum similarity” according to the algorithm of Needleman and Wunsch (J. Mol. Biol. 48:443-453, 1970). GAP is more suited to comparing sequences which are approximately the same length and for which an alignment is expected over the entire length. Preferably the parameters “Gap Weight” and “Length Weight” used in each program are 50 and 3 for polynucleotides and 12 and 4 for polypeptides, respectively. Preferably % identities and similarities are determined when the two sequences being compared are optimally aligned.
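The difference between the global (Needleman and Wunsch) and local (Smith and Waterman) approaches can be illustrated with a short Python sketch. It is not the BESTFIT or GAP implementation: it computes alignment scores only, and it assumes a simplified scoring scheme (match +1, mismatch -1, linear gap penalty -1) rather than the Gap Weight and Length Weight parameters given above. The function names `global_score` and `local_score` are hypothetical.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch style score: both sequences aligned end to end."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman style score: best single region of similarity."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


long_seq, short_seq = "TTTGATTACATTT", "GATTACA"
print(global_score(long_seq, short_seq))  # 1: seven matches, six gap positions
print(local_score(long_seq, short_seq))   # 7: flanking residues are ignored
```

With sequences of dissimilar length the global score is pulled down by the unavoidable end gaps, whereas the local score reflects only the shared region, which is consistent with the preference stated above for BESTFIT in that situation.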

[0098] Other programs for determining identity and/or similarity between sequences are also known in the art, for instance the BLAST family of programs (Karlin & Altschul, 1990, Proc. Natl. Acad. Sci. USA 87:2264-2268, modified as in Karlin & Altschul, 1993, Proc. Natl. Acad. Sci. USA 90:5873-5877, available from the National Center for Biotechnology Information (NCBI), Bethesda, Md., USA and accessible through the home page of the NCBI at www.ncbi.nlm.nih.gov). These programs exemplify a preferred, non-limiting example of a mathematical algorithm utilized for the comparison of two sequences. Such an algorithm is incorporated into the NBLAST and XBLAST programs of Altschul, et al., 1990, J. Mol. Biol. 215:403-410. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., 1997, Nucleic Acids Res. 25:3389-3402. Alternatively, PSI-Blast can be used to perform an iterated search which detects distant relationships between molecules (Id.). When utilizing BLAST, Gapped BLAST, and PSI-Blast programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov. Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, 1988, CABIOS 4:11-17. Such an algorithm is incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used.

[0099] Another non-limiting example of a program for determining identity and/or similarity between sequences known in the art is FASTA (Pearson W. R. and Lipman D. J., Proc. Nat. Acad. Sci., USA, 85:2444-2448, 1988, available as part of the Wisconsin Sequence Analysis Package). Preferably the BLOSUM62 amino acid substitution matrix (Henikoff S. and Henikoff J. G., Proc. Nat. Acad. Sci., USA, 89:10915-10919, 1992) is used in polypeptide sequence comparisons, including where nucleotide sequences are first translated into amino acid sequences before comparison.

[0100] Yet another non-limiting example of a program known in the art for determining identity and/or similarity between amino acid sequences is SeqWeb Software (a web-based interface to the GCG Wisconsin Package: Gap program) which is utilized with the default algorithm and parameter settings of the program: blosum 62, gap weight 8, length weight 2.

[0101] The percent identity between two sequences can be determined using techniques similar to those described above, with or without allowing gaps. In calculating percent identity, typically exact matches are counted.

[0102] Preferably the program BESTFIT is used to determine the % identity of a query polynucleotide or a polypeptide sequence with respect to a polynucleotide or a polypeptide sequence of the present invention, the query and the reference sequence being optimally aligned and the parameters of the program set at the default value.

[0103] Alternatively, variants and fragments of the nucleic acid molecules of the invention can be identified by hybridization to SEQ ID NOs: 1, 2, 4-6, 8, 10, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, or 33. In the context of the present invention “stringent conditions” are defined as those given in Martin et al (EMBO J. 4:1625-1630 (1985)) and Davies et al (Methods in Molecular Biology Vol 28: Protocols for nucleic acid analysis by non-radioactive probes, Isaac, P. G. (ed) pp 9-15, Humana Press Inc., Totowa N.J., USA).

[0104] The conditions under which hybridization and/or washing can be carried out can range from 42° C. to 68° C. and the washing buffer can comprise from 0.1×SSC, 0.5% SDS to 6×SSC, 0.5% SDS. Typically, hybridization can be carried out overnight at 65° C. (high stringency conditions), 60° C. (medium stringency conditions), or 55° C. (low stringency conditions). For high stringency washing, the filters can be washed for 2×15 minutes with 0.1×SSC, 0.5% SDS at 65° C. For medium stringency washing, the filters can be washed for 2×15 minutes with 0.1×SSC, 0.5% SDS at 63° C. For low stringency washing, the filters can be washed for 2×15 minutes with 2×SSC, 0.5% SDS at 60° C.
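For convenience, the typical hybridization and washing conditions stated above can be collected in a small lookup table. The Python sketch below simply restates those values; the names `Stringency`, `STRINGENCY` and `describe` are hypothetical and are not part of any laboratory software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    hybridization_temp_c: int   # overnight hybridization temperature
    wash_buffer: str            # composition of the washing buffer
    wash_temp_c: int            # washing temperature
    washes: int = 2             # number of washes
    minutes_per_wash: int = 15  # duration of each wash


# Values as stated in the paragraph above.
STRINGENCY = {
    "high": Stringency(65, "0.1xSSC, 0.5% SDS", 65),
    "medium": Stringency(60, "0.1xSSC, 0.5% SDS", 63),
    "low": Stringency(55, "2xSSC, 0.5% SDS", 60),
}


def describe(level):
    """Return a one-line protocol summary for the given stringency level."""
    s = STRINGENCY[level]
    return (f"{level}: hybridize overnight at {s.hybridization_temp_c} C; "
            f"wash {s.washes} x {s.minutes_per_wash} min "
            f"in {s.wash_buffer} at {s.wash_temp_c} C")


for level in STRINGENCY:
    print(describe(level))
```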

[0105] In instances wherein the nucleic acid molecules are oligonucleotides (“oligos”), highly stringent conditions may refer, e.g., to washing in 6×SSC/0.05% sodium pyrophosphate at 37° C. (for 14-base oligos), 48° C. (for 17-base oligos), 55° C. (for 20-base oligos), and 60° C. (for 23-base oligos). These nucleic acid molecules may act as plant glycogenin-like gene antisense molecules, useful, for example, in plant glycogenin-like gene regulation and/or as antisense primers in amplification reactions of plant glycogenin-like gene and/or nucleic acid molecules. Further, such nucleic acid molecules may be used as part of ribozyme and/or triple helix sequences, also useful for plant glycogenin-like gene regulation. Still further, such molecules may be used as components in probing methods whereby the presence of a plant glycogenin-like allele may be detected.

[0106] In one embodiment, a nucleic acid molecule of the invention may be used to identify other plant glycogenin-like genes by identifying homologs. This procedure may be performed using standard techniques known in the art, for example screening of a cDNA library by probing; amplification of candidate nucleic acid molecules; complementation analysis; and the yeast two-hybrid system (Fields and Song, Nature 340:245-246 (1989); Green and Hannah, Plant Cell 10:1295-1306 (1998)).

[0107] The invention also includes nucleic acid molecules, preferably DNA molecules, that are amplified using the polymerase chain reaction and that encode a gene product functionally equivalent to a plant glycogenin-like gene product.

[0108] In another embodiment of the invention, nucleic acid molecules which hybridize under stringent conditions to the nucleic acid molecules comprising a plant glycogenin-like gene and its complement are used in altering starch synthesis in a plant. Such nucleic acid molecules may hybridize to any part of a plant glycogenin-like gene, including the regulatory elements. Preferred nucleic acid molecules are those which hybridize under stringent conditions to a nucleic acid molecule comprising the nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 2, and/or a nucleotide sequence of any one of SEQ ID NOs: 1, 2, 4-6, 8, 10, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, or 33 or their complement sequences. Preferably, the nucleic acid molecules which hybridize under stringent conditions to a nucleic acid molecule comprising the sequence of a plant glycogenin-like gene or its complement are complementary to the nucleic acid molecule to which they hybridize.

[0109] In another embodiment of the invention, nucleic acid molecules which hybridize under stringent conditions to the nucleic acid molecules of SEQ ID NOs: 1, 2, 4-6, 8, 10, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, or 33 hybridize over the full length of the sequences of the nucleic acid molecules.

[0110] Alternatively, nucleic acid molecules of the invention or their expression products may be used in screening for agents which alter the activity of a plant glycogenin-like protein of a plant. Such a screen will typically comprise contacting a putative agent with a nucleic acid molecule of the invention or expression product thereof and monitoring the reaction therebetween. The reaction may be monitored by expression of a reporter gene operably linked to a nucleic acid molecule of the invention, or by binding assays which will be known to persons skilled in the art.

[0111] Fragments of a plant glycogenin-like nucleic acid molecule of the invention preferably comprise or consist of at least 40 continuous or consecutive nucleotides of the plant glycogenin-like nucleic acid molecule of the invention, more preferably at least 60 nucleotides, at least 80 nucleotides, or most preferably at least 100 or 150 nucleotides in length. Fragments of a plant glycogenin-like nucleic acid molecule of the invention encompassed by the invention may include elements involved in regulating expression of the gene or may encode functional plant glycogenin-like proteins. Fragments of the nucleic acid molecules of the invention encompass fragments of SEQ ID NOs: 1, 2, 4-6, 8, 10, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31 and 33 as well as fragments of the variants of those sequences identified as defined above by percent homology or hybridization assay.
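Whether a candidate sequence comprises a run of at least 40 consecutive nucleotides of a reference sequence can be checked computationally. The Python sketch below is one simple way to do so, assuming exact matching on a single strand; the function name `comprises_fragment` is hypothetical, and the sequences used are synthetic placeholders rather than any sequence of the invention.

```python
def comprises_fragment(candidate, reference, min_len=40):
    """True if candidate contains at least min_len consecutive nucleotides
    that also occur consecutively in reference (exact match, same strand)."""
    if len(candidate) < min_len or len(reference) < min_len:
        return False
    # Every window of min_len nucleotides present in the reference.
    windows = {reference[i:i + min_len]
               for i in range(len(reference) - min_len + 1)}
    # A longer shared run necessarily contains a shared window of min_len.
    return any(candidate[j:j + min_len] in windows
               for j in range(len(candidate) - min_len + 1))


reference = "ACGT" * 20            # synthetic 80-nucleotide placeholder
print(comprises_fragment(reference[10:55], reference))  # True (45 nt run)
print(comprises_fragment(reference[:39], reference))    # False (only 39 nt)
```

The `min_len` argument can be set to 60, 80, 100 or 150 to test the more preferred fragment lengths.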

[0112] Examples of fragments encompassed by the invention include exons of the PGSIP gene. SEQ ID NO: 1 indicates exon and intron boundaries of the plant glycogenin-like gene PGSIP. Nucleic acid molecules comprising PGSIP exon and intron sequences are encompassed by the present invention. In one embodiment, five exons are included (SEQ ID NO: 1; GenBank Accession No. NM_112752). PGSIP exon 1 encompasses nucleotides 377 to 423 of the sequence shown in SEQ ID NO: 1; exon 2 encompasses nucleotides 516 to 592 of the sequence shown in SEQ ID NO: 1; exon 3 encompasses nucleotides 1039 to 1655 of the sequence shown in SEQ ID NO: 1; exon 4 encompasses nucleotides 1762 to 2536 of the sequence shown in SEQ ID NO: 1; and exon 5 encompasses nucleotides 2991 to 3264 of the sequence shown in SEQ ID NO: 1.
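The exon coordinates listed above are 1-based and inclusive, so the length of each exon is end minus start plus one. The following Python sketch computes those lengths and shows how exon sequences could be joined. The names `PGSIP_EXONS`, `exon_lengths` and `splice_exons` are hypothetical, and the sequence used is a synthetic placeholder of the right length, not SEQ ID NO: 1.

```python
# Exon boundaries as stated above (1-based, inclusive positions).
PGSIP_EXONS = [(377, 423), (516, 592), (1039, 1655), (1762, 2536), (2991, 3264)]


def exon_lengths(exons):
    """Length of each exon from 1-based inclusive coordinates."""
    return [end - start + 1 for start, end in exons]


def splice_exons(sequence, exons):
    """Concatenate the exon regions of a genomic sequence in order."""
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return "".join(sequence[start - 1:end] for start, end in exons)


lengths = exon_lengths(PGSIP_EXONS)
print(lengths)       # [47, 77, 617, 775, 274]
print(sum(lengths))  # 1790

placeholder = "ACGT" * 816  # synthetic 3264-nucleotide stand-in
print(len(splice_exons(placeholder, PGSIP_EXONS)))  # 1790
```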

[0113] Further, a plant glycogenin-like nucleic acid molecule of the invention can comprise two or more of any above-described sequences, or variants thereof, linked together to form a larger subsequence.

[0114] The nucleic acid molecules of the invention can comprise or consist of an EST sequence. The EST nucleic acid molecules of the invention can be used as probes for cloning corresponding full length genes. For example, the barley EST of SEQ ID NO: 16 can be utilized as a probe in identifying and cloning the full length barley homolog of the Arabidopsis PGSIP gene. The EST nucleic acid molecules of the invention may be used as sequence probes in connection with computer software to search databases, such as GenBank, for homologous sequences. Alternatively, the EST nucleic acid molecules can be used as probes in hybridization reactions as described herein. The EST nucleic acid molecules of the invention can also be used as molecular markers to map chromosome regions.

[0115] In certain embodiments, the plant glycogenin-like nucleic acid molecules and polypeptides do not include sequences consisting of those sequences known in the art. For example, in one embodiment, the plant glycogenin-like nucleic acid molecules do not include EST sequences.

[0116] In other embodiments, the plant glycogenin-like nucleic acid molecules of the invention encode polypeptides that function as plant glycogenin-like proteins. The functionality of such nucleic acid molecules can be assessed using the yeast hybrid complementation assay as described herein in Example 3. Alternatively, the functionality of such nucleic acid molecules can be assessed using a complementation assay in Arabidopsis as described in this section.

[0117] An isolated nucleic acid molecule encoding a variant protein can be created by introducing one or more nucleotide substitutions, additions or deletions into the plant glycogenin-like nucleic acid molecule, such that one or more amino acid substitutions, additions or deletions are introduced into the encoded protein. Mutations can be introduced by standard techniques, such as ethyl methane sulfonate, X-rays, gamma rays, T-DNA mutagenesis, site-directed mutagenesis, or PCR-mediated mutagenesis. Briefly, PCR primers are designed that delete the trinucleotide codon of the amino acid to be changed and replace it with the trinucleotide codon of the amino acid to be included. These primers are used in the PCR amplification of DNA encoding the protein of interest. The amplified fragment is then isolated and inserted into the full length cDNA encoding the protein of interest and expressed recombinantly.
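The codon replacement step described above can be modelled in silico before primers are designed. The Python sketch below replaces the trinucleotide codon at a given residue position of a coding sequence; it is an illustration only, the function name `replace_codon` is hypothetical, and the nine-nucleotide sequence is a toy example rather than a sequence of the invention.

```python
def replace_codon(cds, residue_number, new_codon):
    """Replace the codon for the given residue (1-based) in a coding sequence.

    The coding sequence is assumed to start at the first nucleotide of the
    first codon, so residue n occupies nucleotides 3(n-1)+1 to 3n.
    """
    if len(new_codon) != 3:
        raise ValueError("a codon must be exactly three nucleotides")
    start = (residue_number - 1) * 3
    if residue_number < 1 or start + 3 > len(cds):
        raise ValueError("residue number is outside the coding sequence")
    return cds[:start] + new_codon + cds[start + 3:]


# Change residue 2 from alanine (GCT) to aspartate (GAT).
print(replace_codon("ATGGCTTAA", 2, "GAT"))  # ATGGATTAA
```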

[0118] An isolated nucleic acid molecule encoding a variant protein can be created by any of the methods described in section 5.1. Either conservative or non-conservative amino acid substitutions can be made at one or more amino acid residues. Conservative replacements are those that take place within a family of amino acids that are related in their side chains. Genetically encoded amino acids can be divided into four families: (1) acidic=aspartate, glutamate; (2) basic=lysine, arginine, histidine; (3) nonpolar=alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan; and (4) uncharged polar=glycine, asparagine, glutamine, cysteine, serine, threonine, tyrosine. In similar fashion, the amino acid repertoire can be grouped as (1) acidic=aspartate, glutamate; (2) basic=lysine, arginine, histidine; (3) aliphatic=glycine, alanine, valine, leucine, isoleucine, serine, threonine, with serine and threonine optionally being grouped separately as aliphatic-hydroxyl; (4) aromatic=phenylalanine, tyrosine, tryptophan; (5) amide=asparagine, glutamine; and (6) sulfur-containing=cysteine and methionine. (See, for example, Biochemistry, 4th ed., Ed. by L. Stryer, W H Freeman and Co.: 1995).
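The four-family grouping given above can be used to classify a substitution as conservative or non-conservative. The Python sketch below encodes that grouping with standard one-letter amino acid codes; the names `FAMILIES`, `family_of` and `is_conservative` are hypothetical. The six-group scheme could be encoded in the same way.

```python
# The four side-chain families listed above, in one-letter amino acid codes.
FAMILIES = {
    "acidic": set("DE"),                # aspartate, glutamate
    "basic": set("KRH"),                # lysine, arginine, histidine
    "nonpolar": set("AVLIPFMW"),        # alanine ... tryptophan
    "uncharged polar": set("GNQCSTY"),  # glycine ... tyrosine
}


def family_of(residue):
    """Return the name of the family containing the given residue."""
    for name, members in FAMILIES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"not a genetically encoded amino acid: {residue!r}")


def is_conservative(original, replacement):
    """A replacement is conservative if both residues share a family."""
    return family_of(original) == family_of(replacement)


print(is_conservative("D", "E"))  # True: both acidic
print(is_conservative("D", "K"))  # False: acidic to basic
```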

[0119] Alternatively, mutations can be introduced randomly along all or part of the coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for biological activity to identify mutants that retain activity. Following mutagenesis, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.

[0120] The invention also encompasses (a) DNA vectors that contain any of the foregoing nucleic acids and/or coding sequences (i.e., fragments and variants) and/or their complements (i.e., antisense molecules); (b) DNA expression vectors that contain any of the foregoing nucleic acids and/or coding sequences operatively associated with a regulatory region that directs the expression of the nucleic acids and/or coding sequences; and (c) genetically engineered host cells that contain any of the foregoing nucleic acids and/or coding sequences operatively associated with a regulatory region that directs the expression of the gene and/or coding sequences in the host cell. As used herein, regulatory regions include, but are not limited to, inducible and non-inducible genetic elements known to those skilled in the art that drive and regulate expression of a nucleic acid. The nucleic acid molecules of the invention may be under the control of a promoter, enhancer, operator, cis-acting sequences, or trans-acting factors, or other regulatory sequence. The nucleic acid molecules encoding regulatory regions of the invention may also be functional fragments of a promoter or enhancer. The nucleic acid molecule encoding a regulatory region is preferably one which will target expression to desired cells, tissues, or developmental stages.

[0121] Examples of highly suitable nucleic acid molecules encoding regulatory regions are endosperm specific promoters, such as that of the high molecular weight glutenin (HMWG) gene of wheat, prolamin, or ITR1, or other suitable promoters available to the skilled person such as gliadin, branching enzyme, ADPG pyrophosphorylase, patatin, starch synthase, rice actin, and actin, for example.

[0122] Other suitable promoters include the stem organ specific promoter gSPO-A, the seed specific promoters Napin, KTI 1, 2, & 3, beta-conglycinin, beta-phaseolin, helianthin, phytohemagglutinin, legumin, zein, lectin, leghemoglobin c3, AB13, PvAlf, SH-EP, EP-C1, 2S1, EM 1, and ROM2.

[0123] Constitutive promoters, such as CaMV promoters, including CaMV 35S and CaMV 19S may also be suitable. Other examples of constitutive promoters include Actin 1, Ubiquitin 1, and HMG2.

[0124] In addition, the regulatory region of the invention may be one which is environmental factor-regulated such as promoters that respond to heat, cold, mechanical stress, light, ultra-violet light, drought, salt and pathogen attack. The regulatory region of the invention may also be one which is a hormone-regulated promoter that induces gene expression in response to phytohormones at different stages of plant growth. Useful inducible promoters include, but are not limited to, the promoters of ribulose bisphosphate carboxylase (RUBISCO) genes, chlorophyll a/b binding protein (CAB) genes, heat shock genes, defense responsive genes (e.g., phenylalanine ammonia lyase genes), wound induced genes (e.g., hydroxyproline rich cell wall protein genes), chemically-inducible genes (e.g., nitrate reductase genes, glucanase genes, chitinase genes, PR-1 genes etc.), dark-inducible genes (e.g., asparagine synthetase gene as described by U.S. Pat. No. 5,256,558), and developmental-stage specific genes (e.g., Shoot Meristemless gene, AB13 promoter and the 2S1 and Em 1 promoters for seed development (Devic et al., 1996, Plant Journal 9(2):205-215), and the kin1 and cor6.6 promoters for seed development (Wang et al., 1995, Plant Molecular Biology, 28(4):619-634)). Examples of other inducible promoters and developmental-stage specific promoters can be found in Datla et al., in particular in Table 1 of that publication (Datla et al., 1997, Biotechnology Annual Review 3:269-296).

[0125] A vector of the invention may also contain a sequence encoding a transit peptide which can be fused in-frame such that it is expressed as a fusion protein.

[0126] Methods which are well known to those skilled in the art can be used to construct vectors and/or expression vectors containing plant glycogenin-like protein coding sequences and appropriate transcriptional/translational control signals. These methods include, for example, in vitro recombinant DNA techniques, synthetic techniques and in vivo recombination/genetic recombination. See, for example, the techniques described in Sambrook et al., 1989, and Ausubel et al., 1989. Alternatively, RNA capable of encoding plant glycogenin-like protein sequences may be chemically synthesized using, for example, synthesizers. See, for example, the techniques described in Gait, 1984, Oligonucleotide Synthesis, IRL Press, Oxford. In a preferred embodiment of the invention, the techniques described in section 6, example 6, and illustrated in FIG. 6 are used to construct a vector.

[0127] A variety of host-expression vector systems may be utilized to express the plant glycogenin-like gene products of the invention. Such host-expression systems represent vehicles by which the plant glycogenin-like gene products of interest may be produced and subsequently recovered and/or purified from the culture or plant (using purification methods well known to those skilled in the art), but also represent cells which may, when transformed or transfected with the appropriate nucleic acid molecules, exhibit the plant glycogenin-like protein of the invention in situ. These include but are not limited to microorganisms such as bacteria (e.g., E. coli, B. subtilis) transformed with recombinant bacteriophage DNA, plasmid DNA or cosmid DNA expression vectors containing plant glycogenin-like protein coding sequences; yeast (e.g., Saccharomyces, Pichia) transformed with recombinant yeast expression vectors containing the plant glycogenin-like protein coding sequences; insect cell systems infected with recombinant virus expression vectors (e.g., baculovirus) containing the plant glycogenin-like protein coding sequences; plant cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV); plant cell systems transformed with recombinant plasmid expression vectors (e.g., Ti plasmid) containing plant glycogenin-like protein coding sequences; or mammalian cell systems (e.g., COS, CHO, BHK, 293, 3T3) harboring recombinant expression constructs containing promoters derived from the genome of mammalian cells (e.g., metallothionein promoter) or from mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K promoter; the cytomegalovirus promoter/enhancer; etc.). In a preferred embodiment of the invention, an expression vector comprising a plant glycogenin-like nucleic acid molecule operably linked to at least one suitable regulatory sequence is incorporated into a plant by one of the methods described in this section, sections 5.3, 5.4 and 5.5 or in examples 7, 8, 9, and 12.

[0128] In bacterial systems, a number of expression vectors may be advantageously selected depending upon the use intended for the plant glycogenin-like protein being expressed. For example, when a large quantity of such a protein is to be produced, for the generation of antibodies or to screen peptide libraries, for example, vectors which direct the expression of high levels of fusion protein products that are readily purified may be desirable. Such vectors include, but are not limited to, the E. coli expression vector pUR278 (Ruther et al., 1983, EMBO J. 2:1791), in which the plant glycogenin-like coding sequence may be ligated individually into the vector in frame with the lac Z coding region so that a fusion protein is produced; pIN vectors (Inouye & Inouye, 1985, Nucleic Acids Res. 13:3101-9; Van Heeke & Schuster, 1989, J. Biol. Chem. 264:5503-9); and the like. pGEX vectors may also be used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In general, such fusion proteins are soluble and can easily be purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione. The pGEX vectors are designed to include thrombin or factor Xa protease cleavage sites so that the cloned target gene protein can be released from the GST moiety.

[0129] In one such embodiment of a bacterial system, full length cDNA nucleic acid molecules are appended with in-frame Bam HI sites at the amino terminus and Eco RI sites at the carboxyl terminus using standard PCR methodologies (Innis et al., 1990, supra) and ligated into the pGEX-2TK vector (Pharmacia, Uppsala, Sweden). The resulting cDNA construct contains a kinase recognition site at the amino terminus for radioactive labeling and glutathione S-transferase sequences at the carboxyl terminus for affinity purification (Nilsson, et al., 1985, EMBO J. 4:1075; Zabeau and Stanley, 1982, EMBO J. 1:1217).

[0130] The recombinant constructs of the present invention may include a selectable marker for propagation of the construct. For example, a construct to be propagated in bacteria preferably contains an antibiotic resistance gene, such as one that confers resistance to kanamycin, tetracycline, streptomycin, or chloramphenicol. Examples of other suitable marker genes include antibiotic resistance genes such as those conferring resistance to G418 and hygromycin (npt-II, hyg-B); herbicide resistance genes such as those conferring resistance to phosphinothricin and sulfonamide based herbicides (bar and sul respectively; EP-A-242246, EP-A-0369637); and screenable markers such as beta-glucuronidase (GB 2197653), luciferase and green fluorescent protein. Suitable vectors for propagating the construct include, but are not limited to, plasmids, cosmids, bacteriophages or viruses.

[0131] The marker gene is preferably controlled by a second promoter which allows expression in cells other than the seed, thus allowing selection of cells or tissue containing the marker at any stage of development of the plant. Preferred second promoters are the promoter of nopaline synthase gene of Agrobacterium and the promoter derived from the gene which encodes the 35S subunit of cauliflower mosaic virus (CaMV) coat protein. However, any other suitable second promoter may be used.

[0132] The nucleic acid molecule encoding a plant glycogenin-like protein may be native or foreign to the plant into which it is introduced. One of the effects of introducing a nucleic acid molecule encoding a plant glycogenin-like gene into a plant is to increase the amount of plant glycogenin-like protein present and therefore the amount of starch produced by increasing the copy number of the nucleic acid molecule. Foreign plant glycogenin-like nucleic acid molecules may in addition have different temporal and/or spatial specificity for starch synthesis compared to the native plant glycogenin-like protein of the plant, and so may be useful in altering when and where or what type of starch is produced. Regulatory elements of the plant glycogenin-like genes may also be used in altering starch synthesis in a plant, for example by replacing the native regulatory elements in the plant or providing additional control mechanisms. The regulatory regions of the invention may confer expression of a plant glycogenin-like gene product in a chemically-inducible, dark-inducible, developmentally regulated, developmental-stage specific, wound-induced, environmental factor-regulated, organ-specific, cell-specific, tissue-specific, or constitutive manner. Alternatively, the expression conferred by a regulatory region may encompass more than one type of expression selected from the group consisting of chemically-inducible, dark-inducible, developmentally regulated, developmental-stage specific, wound-induced, environmental factor-regulated, organ-specific, cell-specific, tissue-specific, and constitutive.

[0133] Further, any of the nucleic acid molecules (including EST clone nucleic acid molecules) and/or polypeptides and proteins described herein, can be used as markers for qualitative trait loci in breeding programs for crop plants. To this end, the nucleic acid molecules, including, but not limited to, full length plant glycogenin-like gene coding sequences, and/or partial sequences (ESTs), can be used in hybridization and/or DNA amplification assays to identify the endogenous plant glycogenin-like genes, plant glycogenin-like gene mutant alleles and/or plant glycogenin-like gene expression products in cultivars as compared to wild-type plants. They can also be used as markers for linkage analysis of qualitative trait loci. It is also possible that the plant glycogenin-like genes may encode a product responsible for a qualitative trait that is desirable in a crop breeding program. Alternatively, the plant glycogenin-like protein and/or peptides can be used as diagnostic reagents in immunoassays to detect expression of the plant glycogenin-like genes in cultivars and wild-type plants.

[0134] Genetically-engineered plants containing constructs comprising the plant glycogenin-like nucleic acid and a reporter gene can be generated using the methods described herein for each plant glycogenin-like nucleic acid gene variant, to screen for loss-of-function variants induced by mutations, including but not limited to, deletions, point mutations, rearrangements, translocations, etc. The constructs can encode fusion proteins comprising a plant glycogenin-like protein fused to a protein product encoded by a reporter gene. Alternatively, the constructs can encode a plant glycogenin-like protein and a reporter gene product that are not fused. The constructs may be transformed into the homozygous recessive plant glycogenin-like gene mutant background, and the restorative phenotype examined, i.e. quantity and quality of starch, as a complementation test to confirm the functionality of the variants isolated.

5.2 Plant Glycogenin-Like Gene Products

[0135] The invention encompasses the polypeptides of SEQ ID NOs: 3, 7, 11, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 31, 32, or 34. Plant glycogenin-like proteins, polypeptides and peptide fragments, variants, allelic variants, mutated, truncated or deleted forms of plant glycogenin-like proteins and/or plant glycogenin-like fusion proteins can be prepared for a variety of uses, including, but not limited to, the generation of antibodies, as reagents in assays, the identification of other cellular gene products involved in starch synthesis and/or starch synthesis initiation, etc.

[0136] Plant glycogenin-like translational products include, but are not limited to, those proteins and polypeptides encoded by the sequences of the plant glycogenin-like nucleic acid molecules of the invention. The invention encompasses proteins that are functionally equivalent to the plant glycogenin-like gene products of the invention.

[0137] The primary use of the plant glycogenin-like gene products of the invention is to alter starch synthesis via increasing the number of priming or initiation sites for elongation of glucose chains.

[0138] In an embodiment of the invention, an isolated polypeptide comprises the amino acid sequence of SEQ ID NO: 9, or a variant or fragment thereof, provided the amino acid sequence is not that of SEQ ID NO: 35.

[0139] The present invention also provides variants of the polypeptides of the invention. Such variants have an altered amino acid sequence which can function as either agonists (mimetics) or as antagonists. Variants can be generated by mutagenesis, e.g., discrete point mutation or truncation. An agonist can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of the protein. An antagonist of a protein can inhibit one or more of the activities of the naturally occurring form of the protein by, for example, deleting one or more of the receiver domains. Thus, specific biological effects can be elicited by addition of a variant of limited function.

[0140] Modification of the structure of the subject polypeptides can be for such purposes as enhancing efficacy, stability, or post-translational modifications (e.g., to alter the phosphorylation pattern of the protein). Such modified peptides, when designed to retain at least one activity of the naturally-occurring form of the protein, or to produce specific antagonists thereof, are considered functional equivalents of the polypeptides. Such modified peptides can be produced, for instance, by amino acid substitution, deletion, or addition.

[0141] For example, it is reasonable to expect that an isolated replacement of a leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar replacement of an amino acid with a structurally related amino acid (i.e. isosteric and/or isoelectric mutations) will not have a major effect on the biological activity of the resulting molecule.
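
By way of illustration only, the replacements named above can be tabulated and used to flag which differences between a wild-type and a variant sequence fall within a structurally related group. The following is a minimal sketch; the grouping table contains only the three examples given above, and the function names and test sequences are hypothetical and are not taken from the sequence listing.

```python
# Groups of structurally related residues, limited to the examples named in
# the text (leucine/isoleucine/valine, aspartate/glutamate, threonine/serine).
# The table and all names below are illustrative assumptions.
RELATED_GROUPS = [
    {"L", "I", "V"},
    {"D", "E"},
    {"T", "S"},
]


def is_related_replacement(original, replacement):
    """Return True if both residues belong to the same related group."""
    if original == replacement:
        return True
    return any(original in g and replacement in g for g in RELATED_GROUPS)


def classify_substitutions(wild_type, variant):
    """List (position, original, replacement, related?) for each difference.

    Positions are 1-based. Both sequences must have the same length, i.e.
    only substitutions (no insertions or deletions) are considered.
    """
    if len(wild_type) != len(variant):
        raise ValueError("sequences must have equal length")
    changes = []
    for pos, (a, b) in enumerate(zip(wild_type, variant), start=1):
        if a != b:
            changes.append((pos, a, b, is_related_replacement(a, b)))
    return changes
```

Such a tabulation only sorts substitutions into candidate classes; whether a given variant retains activity is determined by the functional assays described herein.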

[0142] Whether a change in the amino acid sequence of a peptide results in a functional homolog (e.g., functional in the sense that the resulting polypeptide mimics or antagonizes the wild-type form) can be readily determined by assessing the ability of the variant peptide to produce a response in cells in a fashion similar to the wild-type protein, or competitively inhibit such a response. Polypeptides in which more than one replacement has taken place can readily be tested in the same manner.

[0143] In a preferred embodiment, a mutant polypeptide that is a variant of a polypeptide of the invention can be assayed for: (1) the ability to complement glycogenin function in a yeast or plant system in which the native glycogenin or plant glycogenin-like genes have been knocked out; (2) the ability to form a complex with a glucose or oligosaccharide; or (3) the ability to promote initiation of elongation of polysaccharide chains.

[0144] The invention encompasses functionally equivalent mutant plant glycogenin-like proteins and polypeptides. The invention also encompasses mutant plant glycogenin-like proteins and polypeptides that are not functionally equivalent to the gene products. Such a mutant plant glycogenin-like protein or polypeptide may contain one or more deletions, additions or substitutions of plant glycogenin-like amino acid residues within the amino acid sequence encoded by any one of the plant glycogenin-like nucleic acid molecules described above in Section 5.1, and which result in loss of one or more functions of the plant glycogenin-like protein, thus producing a plant glycogenin-like gene product not functionally equivalent to the wild-type plant glycogenin-like protein.

[0145] Mutations can be made to plant glycogenin-like DNA (using techniques discussed above as well as those well known to one of skill in the art) and the resulting mutant plant glycogenin-like proteins and polypeptides tested for activity. Mutants can be isolated that display increased function (e.g., resulting in improved root formation) or decreased function (e.g., resulting in suboptimal root function). In particular, mutated plant glycogenin-like proteins in which any of the exons shown in SEQ ID NO: 1 are deleted or mutated are within the scope of the invention. Additionally, peptides corresponding to one or more exons of the plant glycogenin-like protein, and truncated or deleted plant glycogenin-like proteins, are also within the scope of the invention. Fusion proteins in which the full length plant glycogenin-like protein or a plant glycogenin-like polypeptide or peptide is fused to an unrelated protein are also within the scope of the invention and can be designed on the basis of the plant glycogenin-like nucleotide and plant glycogenin-like amino acid sequences disclosed herein.

[0146] While the plant glycogenin-like polypeptides and peptides can be chemically synthesized (e.g., see Creighton, 1983, Proteins: Structures and Molecular Principles, W. H. Freeman & Co., NY), large polypeptides derived from a plant glycogenin-like gene, and the full length plant glycogenin-like gene product, may advantageously be produced by recombinant DNA technology using techniques well known to those skilled in the art for expressing nucleic acid molecules.

[0147] Nucleotides encoding fusion proteins may include, but are not limited to, nucleotides encoding full length plant glycogenin-like proteins, truncated plant glycogenin-like proteins, or peptide fragments of plant glycogenin-like proteins fused to an unrelated protein or peptide, such as, for example, an enzyme, fluorescent protein, or luminescent protein that can be used as a marker, or an epitope that facilitates affinity-based purification. Alternatively, the fusion protein can further comprise a heterologous protein such as a transit peptide or fluorescent protein.

[0148] In an embodiment of the invention, the percent identity between two polypeptides of the invention is at least 40%. In a preferred embodiment of the invention, the percent identity between two polypeptides of the invention is at least 50%. In another embodiment, the percent identity between two polypeptides of the invention is at least 60%, 70%, 80%, 90%, 95%, 96%, 97%, or at least 98%. Determining whether two sequences are substantially similar may be carried out using any methodologies known to one skilled in the art, preferably using computer assisted analysis as described in section 5.1.
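
As a non-limiting illustration of how a percent identity figure may be compared against the thresholds recited above, the following sketch computes identity over two sequences that have already been aligned. It assumes the alignment is supplied as two equal-length strings with "-" marking gaps, and it counts identical residues over all aligned columns. That choice of denominator is an assumption; alignment programs differ on this point, and the sketch does not reproduce the analysis of section 5.1.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue in both sequences.

    Assumes a pre-computed pairwise alignment: equal-length strings in which
    '-' marks a gap. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment contains no residues")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a, aligned_b, threshold=40.0):
    """True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For the hypothetical aligned pair "MKT-LV" and "MKSALV", four of six columns match, so the pair would satisfy the 40%, 50% and 60% thresholds but not the 70% threshold.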

[0149] Further, it may be desirable to include additional DNA sequences in the protein expression constructs. Examples of additional DNA sequences include, but are not limited to, those encoding: a 3′ untranslated region; a transcription termination and polyadenylation signal; an intron; a signal peptide (which facilitates the secretion of the protein); or a transit peptide (which targets the protein to a particular cellular compartment such as the nucleus, chloroplast, mitochondria or vacuole). The nucleic acid molecules of the invention will preferably comprise a nucleic acid molecule encoding a transit peptide, to ensure delivery of any expressed protein to the plastid. Preferably the transit peptide will be selective for plastids such as amyloplasts or chloroplasts, and can be native to the nucleic acid molecule of the invention or derived from known plastid sequences, such as those from the small subunit of the ribulose bisphosphate carboxylase enzyme (ssu of rubisco) from pea, maize or sunflower for example. A transit peptide comprising amino acid residues 1-195 of SEQ ID NO: 2 is an example of a transit peptide native to the polypeptide of the invention. Where an agonist or antagonist which modulates activity of the plant glycogenin-like protein is a polypeptide, the polypeptide itself must be appropriately targeted to the plastids, for example by the presence of a plastid targeting signal at the N terminal end of the protein (Castro Silva Filho et al., Plant Mol Biol 30: 769-780 (1996)) or by protein-protein interaction (Schenke PC et al., Plant Physiol 122: 235-241 (2000) and Schenke et al., PNAS 98(2): 765-770 (2001)). The transit peptides of the invention are used to target transportation of plant glycogenin-like proteins as well as agonists or antagonists thereof to plastids, the sites of starch synthesis, thus altering the starch synthesis process and resulting starch characteristics.

[0150] The plant glycogenin-like proteins and transit peptides associated with the plant glycogenin-like genes of the present invention have a number of important agricultural uses. The transit peptides associated with the plant glycogenin-like genes of the invention may be used, for example, in transportation of desired heterologous gene products to a root, a root modified through evolution, tuber, stem, a stem modified through evolution, seed, and/or endosperm of transgenic plants transformed with such constructs.

[0151] The invention encompasses methods of screening for agents (i.e., proteins, small molecules, peptides) capable of altering the activity of a plant glycogenin-like protein in a plant. Variants of a protein of the invention which function as either agonists (mimetics) or as antagonists can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of the protein of the invention for agonist or antagonist activity. In one embodiment, a variegated library of variants is generated by combinatorial mutagenesis at the nucleic acid level and is encoded by a variegated gene library. A variegated library of variants can be produced by, for example, enzymatically ligating a mixture of synthetic oligonucleotides into nucleic acid molecules such that a degenerate set of potential protein sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage display). There are a variety of methods which can be used to produce libraries of potential variants of the polypeptides of the invention from a degenerate oligonucleotide sequence. Methods for synthesizing degenerate oligonucleotides are known in the art (see, e.g., Narang, 1983, Tetrahedron 39:3; Itakura et al., 1984, Annu. Rev. Biochem. 53:323; Itakura et al., 1984, Science 198:1056; Ike et al., 1983, Nucleic Acid Res. 11:477).
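
To illustrate what is meant by a degenerate set, the following minimal sketch enumerates every concrete oligonucleotide represented by a degenerate sequence written with the standard IUPAC nucleotide ambiguity codes. The function name and the example sequences are hypothetical; the sketch shows the combinatorial size of such a set and is not a synthesis method.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(oligo):
    """Return every concrete sequence encoded by a degenerate oligonucleotide.

    Raises KeyError if the input contains a symbol outside the IUPAC table.
    """
    pools = [IUPAC_CODES[base] for base in oligo.upper()]
    return ["".join(choice) for choice in product(*pools)]
```

For example, the degenerate codon "GAY" expands to "GAC" and "GAT", and the commonly used randomization codon "NNK" expands to 32 concrete codons. The size of the set grows multiplicatively with each ambiguous position, which is why library complexity is usually considered before a degenerate design is chosen.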

[0152] In addition, libraries of fragments of the coding sequence of a polypeptide of the invention can be used to generate a variegated population of polypeptides for screening and subsequent selection of variants. For example, a library of coding sequence fragments can be generated by treating a double stranded PCR fragment of the coding sequence of interest with a nuclease under conditions wherein nicking occurs only about once per molecule, denaturing the double stranded DNA, renaturing the DNA to form double stranded DNA which can include sense/antisense pairs from different nicked products, removing single stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting fragment library into an expression vector. By this method, an expression library can be derived which encodes N-terminal and internal fragments of various sizes of the protein of interest.

[0153] Several techniques are known in the art for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property. The most widely used techniques, which are amenable to high through-put analysis, for screening large gene libraries typically include cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates isolation of the vector encoding the gene whose product was detected. Recursive ensemble mutagenesis (REM), a technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify variants of a protein of the invention (Arkin and Youvan, 1992, Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al., 1993, Protein Engineering 6(3):327-331).

[0154] An isolated polypeptide of the invention, or a fragment thereof, can be used as an immunogen to generate antibodies using standard techniques for polyclonal and monoclonal antibody preparation. The full-length polypeptide or protein can be used or, alternatively, the invention provides antigenic peptide fragments for use as immunogens. In one embodiment, the antigenic peptide of a protein of the invention, or fragments or immunogenic fragments of a protein of the invention, comprise at least 8 (preferably 10, 15, 20, 30 or 35) consecutive amino acid residues of the amino acid sequence of SEQ ID NO: 3, 7, 9, 11, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, or 34 and encompass an epitope of the protein such that an antibody raised against the peptide forms a specific immune complex with the protein.

[0155] Exemplary amino acid sequences of the polypeptides of the invention can be used to generate antibodies against plant glycogenin-like gene products. In one embodiment, the immunogenic polypeptide is conjugated to keyhole limpet hemocyanin (“KLH”) and injected into rabbits. Rabbit IgG polyclonal antibodies can be purified, for example, on a peptide affinity column. The antibodies can then be used to bind to and identify the polypeptides of the invention that have been extracted and separated via gel electrophoresis or other means.

[0156] One aspect of the invention pertains to isolated plant glycogenin-like polypeptides of the invention, variants thereof, as well as variants suitable for use as immunogens to raise antibodies directed against a plant glycogenin-like polypeptide of the invention. In one embodiment, the native polypeptide can be isolated, using standard protein purification techniques, from cells or tissues expressing a plant glycogenin-like polypeptide. In a preferred embodiment, plant glycogenin-like polypeptides of the invention are produced from expression vectors by recombinant DNA techniques. In another preferred embodiment, a polypeptide of the invention is synthesized chemically using standard peptide synthesis techniques.

[0157] An isolated or purified protein or biologically active portion thereof is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free of chemical precursors or other chemicals when chemically synthesized. The language “substantially free” indicates protein preparations in which the protein is separated from cellular components of the cells from which it is isolated or recombinantly produced. Thus, protein that is substantially free of cellular material includes protein preparations having less than 20%, 10%, or 5% (by dry weight) of a contaminating protein. Similarly, when an isolated plant glycogenin-like polypeptide of the invention is recombinantly produced, it is substantially free of culture medium. When the plant glycogenin-like polypeptide is produced by chemical synthesis, it is preferably substantially free of chemical precursors or other chemicals.

[0158] Biologically active portions of a polypeptide of the invention include polypeptides comprising amino acid sequences identical to or derived from the amino acid sequence of the protein, such that the variant sequences comprise conservative substitutions or truncations (e.g., amino acid sequences comprising fewer amino acids than those shown in any of SEQ ID NOs: 3, 7, 9, 11, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, and 34, but which maintain a high degree of homology to the remaining amino acid sequence). Typically, biologically active portions comprise a domain or motif with at least one activity of the corresponding protein. A biologically active portion of a protein of the invention can be a polypeptide which is, for example, at least 10, 25, 50, 100, 200, 300, 400 or 500 amino acids in length. Domains or motifs that polypeptides of the invention can comprise include, but are not limited to, a glycosylation domain or site for complexing with polysaccharide or for attachment of disaccharide or a monomeric unit thereof, or a site that interacts with starch synthase and other enzymes that act on the polysaccharide.

5.3 Production of Transgenic Plants and Plant Cells

[0159] The invention also encompasses transgenic or genetically-engineered plants, and progeny thereof. As used herein, a transgenic or genetically-engineered plant refers to a plant and a portion of its progeny which comprises a nucleic acid molecule which is not native to the initial parent plant. The introduced nucleic acid molecule may originate from the same species, e.g., if the desired result is over-expression of the endogenous gene, or from a different species. A transgenic or genetically-engineered plant may be easily identified by a person skilled in the art by comparing the genetic material from a non-transformed plant and a plant produced by a method of the present invention. For example, a transgenic plant may comprise multiple copies of plant glycogenin-like genes, and/or foreign nucleic acid molecules. Transgenic plants are readily distinguishable from non-transgenic plants by standard techniques. For example, a PCR test may be used to demonstrate the presence or absence of introduced genetic material. Transgenic plants may also be distinguished from non-transgenic plants at the DNA level by Southern blot, at the RNA level by Northern blot, or at the protein level by western blot, by measurement of enzyme activity, or by starch composition or properties.

[0160] The nucleic acids of the invention may be introduced into a cell by any suitable means. Preferred means include use of a disarmed Ti-plasmid vector carried by Agrobacterium by procedures known in the art, for example as described in EP-A-0116718 and EP-A-0270822. Agrobacterium mediated transformation methods are now available for monocots, for example as described in EP 0672752 and WO00/63398. Alternatively, the nucleic acid may be introduced directly into plant cells using a particle gun. A further method would be to transform a plant protoplast, which involves first removing the cell wall and introducing the nucleic acid molecule and then reforming the cell wall. The transformed cell can then be grown into a plant.

[0161] In an embodiment of the present invention, Agrobacterium is employed to introduce the gene constructs into plants. Such transformations preferably use binary Agrobacterium T-DNA vectors (Bevan, 1984, Nuc. Acid Res. 12:8711-21), and the co-cultivation procedure (Horsch et al., 1985, Science 227:1229-31). Generally, the Agrobacterium transformation system is used to engineer dicotyledonous plants (Bevan et al., 1982, Ann. Rev. Genet. 16:357-84; Rogers et al., 1986, Methods Enzymol. 118:627-41). The Agrobacterium transformation system may also be used to transform, as well as transfer, DNA to monocotyledonous plants and plant cells (see Hernalsteens et al., 1984, EMBO J. 3:3039-41; Hooykaas-Van Slogteren et al., 1984, Nature 311:763-4; Grimsley et al., 1987, Nature 325:1677-79; Boulton et al., 1989, Plant Mol. Biol. 12:31-40; Gould et al., 1991, Plant Physiol. 95:426-34).

[0162] Various alternative methods for introducing recombinant nucleic acid constructs into plants and plant cells may also be utilized. These other methods are particularly useful where the target is a monocotyledonous plant or plant cell. Alternative gene transfer and transformation methods include, but are not limited to, protoplast transformation through calcium-, polyethylene glycol (PEG)- or electroporation-mediated uptake of naked DNA (see Paszkowski et al., 1984, EMBO J. 3:2717-22; Potrykus et al., 1985, Mol. Gen. Genet. 199:169-177; Fromm et al., 1985, Proc. Natl. Acad. Sci. USA 82:5824-8; Shimamoto, 1989, Nature 338:274-6), and electroporation of plant tissues (D'Halluin et al., 1992, Plant Cell 4:1495-1505). Additional methods for plant cell transformation include microinjection, silicon carbide mediated DNA uptake (Kaeppler et al., 1990, Plant Cell Reporter 9:415-8), and microprojectile bombardment (Klein et al., 1988, Proc. Natl. Acad. Sci. USA 85:4305-9; Gordon-Kamm et al., 1990, Plant Cell 2:603-18).

[0163] According to the present invention, desired plants and plant cells may be obtained by engineering the gene constructs described herein into a variety of plant cell types, including, but not limited to, protoplasts, tissue culture cells, tissue and organ explants, pollen, embryos as well as whole plants. In an embodiment of the present invention, the engineered plant material is selected or screened for transformants (i.e., those that have incorporated or integrated the introduced gene construct or constructs) following the approaches and methods described below. An isolated transformant may then be regenerated into a plant. Alternatively, the engineered plant material may be regenerated into a plant, or plantlet, before subjecting the derived plant, or plantlet, to selection or screening for the marker gene traits. Procedures for regenerating plants from plant cells, tissues or organs, either before or after selecting or screening for marker gene or genes, are well known to those skilled in the art.

[0164] A transformed plant cell, callus, tissue or plant may be identified and isolated by selecting or screening the engineered plant material for traits encoded by the marker genes present on the transforming DNA. For instance, selection may be performed by growing the engineered plant material on media containing inhibitory amounts of the antibiotic or herbicide to which the transforming marker gene construct confers resistance. Further, transformed plants and plant cells may also be identified by screening for the activities of any visible marker genes (e.g., the β-glucuronidase, luciferase, green fluorescent protein, B or C1 anthocyanin genes) that may be present on the recombinant nucleic acid constructs of the present invention. Such selection and screening methodologies are well known to those skilled in the art.

[0165] The present invention is applicable to all plants which produce or store starch. Examples of such plants are cereals such as maize, wheat, rice, sorghum, barley; fruit producing species such as banana, apple, tomato or pear; root crops such as cassava, potato, yam, beet or turnip; oilseed crops such as rapeseed, canola, sunflower, oil palm, coconut, linseed or groundnut; meal crops such as soya, bean or pea; and any other suitable species.

[0166] In a preferred embodiment of the present invention, the method comprises the additional step of growing the plant and harvesting the starch from a plant part. In order to harvest the starch, it is preferred that the plant is grown until plant parts containing starch develop, which may then be removed. In a further preferred embodiment, the propagating material from the plant may be removed, for example the seeds. The plant part can be an organ such as a stem, root, leaf, or reproductive body. Alternatively, the plant part may be a modified organ such as a tuber, or the plant part is a tissue such as endosperm.

5.4 Transgenic Plants That Ectopically Express Plant Glycogenin-Like Protein

[0167] According to one aspect of the invention, the nucleic acid molecule expressed in the plant cell, plant, or part of a plant comprises a nucleotide sequence encoding a plant glycogenin-like protein, fragment or variant thereof. The nucleic acid molecule expressed in the plant cell can comprise a nucleotide sequence encoding a full length plant glycogenin-like protein. Examples of such sequences include SEQ ID NOs: 1, 2, 6, 8, 10, 12, and 14, or variants thereof and the corresponding amino acid sequences of SEQ ID NOs: 3, 7, 9, 11, 13, and 15 or variants thereof.

[0168] In an embodiment of the invention, the nucleic acid molecules of the invention are expressed in a plant cell and are transcribed only in the sense orientation. A plant that expresses a recombinant plant glycogenin-like nucleic acid may be engineered by transforming a plant cell with a nucleic acid construct comprising a regulatory region operably associated with a nucleic acid molecule, the sequence of which encodes a plant glycogenin-like protein or a fragment thereof. In plants derived from such cells, starch synthesis is altered in ways described in section 5.6. The term “operably associated” is used herein to mean that transcription controlled by the associated regulatory region would produce a functional mRNA, whose translation would produce the plant glycogenin-like protein. Starch may be altered in particular parts of a plant, including but not limited to seeds, tubers, leaves, roots and stems or modifications thereof.

[0169] In an embodiment of the invention, a plant is engineered to constitutively express a plant glycogenin-like protein in order to alter the starch content of the plant. In a preferred embodiment, the starch content is 40%, 30%, 20%, 10%, 5%, or 2% greater than that of a non-engineered control plant(s). In another preferred embodiment, the starch content is 40%, 30%, 20%, 10%, 5%, or 2% less than that of a non-engineered control plant(s).

[0170] In another aspect of the invention, where the nucleic acid molecules of the invention are expressed in a plant cell and are transcribed only in the sense orientation, the plant cell and plants derived from such cells exhibit altered starch content. The altered starch content comprises an increase in the ratio of amylose to amylopectin. In one embodiment of the invention, the ratio of amylose to amylopectin increases by 2%, 5%, 10%, 20%, 30%, 40%, or 50% in comparison to a non-engineered control plant(s).
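
By way of a worked illustration, the relative change in the amylose to amylopectin ratio can be calculated from measured starch fractions as sketched below. The function names and the numerical values are hypothetical and are given only to show the arithmetic; they are not measured results.

```python
def amylose_ratio(amylose, amylopectin):
    """Ratio of amylose to amylopectin, both given in the same units."""
    if amylopectin <= 0:
        raise ValueError("amylopectin amount must be positive")
    return amylose / amylopectin


def percent_increase(control_ratio, engineered_ratio):
    """Relative change of the engineered ratio against the control, in percent."""
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    return 100.0 * (engineered_ratio - control_ratio) / control_ratio


# Hypothetical example: control starch is 25% amylose and 75% amylopectin,
# engineered starch is 28% amylose and 72% amylopectin.
control = amylose_ratio(25.0, 75.0)
engineered = amylose_ratio(28.0, 72.0)
change = percent_increase(control, engineered)
```

In this hypothetical case the ratio rises from about 0.33 to about 0.39, a relative increase of about 16.7%, which would exceed the 2%, 5% and 10% levels recited above but not the 20% level.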

[0171] In a preferred embodiment of the invention, the nucleic acid molecules of the invention are expressed in a potato plant and are transcribed only in the sense orientation. The plant, including the tubers, exhibits increased starch content. If the number of copies of the nucleic acid molecules of the invention that are expressed in the potato plant and transcribed only in the sense orientation is increased, the starch content of the plant, including the tubers, increases.

[0172] In yet another embodiment of the present invention, it may be advantageous to transform a plant with a nucleic acid construct operably linking a modified or artificial promoter to a nucleic acid molecule having a sequence encoding a plant glycogenin-like protein or a fragment thereof. Such promoters typically have unique expression patterns and/or expression levels not found in natural promoters because they are constructed by recombining structural elements from different promoters. See, e.g., Salina et al., 1992, Plant Cell 4:1485-93, for examples of artificial promoters constructed from combining cis-regulatory elements with a promoter core.

[0173] In a preferred embodiment of the present invention, the associated promoter is a strong root and/or embryo-specific plant promoter such that the plant glycogenin-like protein is overexpressed in the transgenic plant.

[0174] In yet another preferred embodiment of the present invention, the overexpression of plant glycogenin-like protein in starch producing organs and organelles may be engineered by increasing the copy number of the plant glycogenin-like gene. One approach to producing such transgenic plants is to transform with nucleic acid constructs that contain multiple copies of the complete plant glycogenin-like gene with native or heterologous promoters. Another approach is to repeatedly transform successive generations of a plant line with one or more copies of the complete plant glycogenin-like gene constructs. Yet another approach is to place a complete plant glycogenin-like gene in a nucleic acid construct containing an amplification-selectable marker (ASM) gene such as the glutamine synthetase or dihydrofolate reductase gene. Cells transformed with such constructs are subjected to culturing regimes that select cell lines with increased copies of the complete plant glycogenin-like gene. See, e.g., Donn et al., 1984, J. Mol. Appl. Genet. 2:549-62, for a selection protocol used to isolate a plant cell line containing amplified copies of the GS gene. Cell lines with amplified copies of the plant glycogenin-like gene can then be regenerated into transgenic plants.

5.5 Transgenic Plants That Suppress Endogenous Plant Glycogenin-Like Protein Expression

[0175] The nucleic acid molecules of the invention may also be used to augment the starch priming activity of a plant cell, plant, or part of a plant, or alternatively to alter activity of the plant glycogenin-like protein of a plant cell, plant, or part of a plant by modifying transcription or translation of the plant glycogenin-like gene. In an embodiment of the invention, an antagonist which is capable of altering the expression of a nucleic acid molecule of the invention is introduced into a plant in order to alter the synthesis of starch. The antagonist may be protein, nucleic acid, chemical antagonist, or any other suitable moiety. Typically, the antagonist will function by inhibiting or enhancing transcription from the plant glycogenin-like gene, either by affecting regulation of the promoter or the transcription process; inhibiting or enhancing translation of any RNA product of the plant glycogenin-like gene; inhibiting or enhancing the activity of the plant glycogenin-like protein itself; or inhibiting or enhancing the protein-protein interaction of the plant glycogenin-like protein and downstream enzymes of the starch biosynthesis pathway. For example, where the antagonist is a protein it may interfere with transcription factor binding to the plant glycogenin-like gene promoter, mimic the activity of a transcription factor, compete with or mimic the plant glycogenin-like protein, interfere with translation of the plant glycogenin-like RNA, or interfere with the interaction of the plant glycogenin-like protein and downstream enzymes. Antagonists which are nucleic acids may encode proteins described above, or may be transposons which interfere with expression of the plant glycogenin-like gene.

[0176] The suppression may be engineered by transforming a plant with a nucleic acid construct encoding an antisense RNA or ribozyme complementary to a segment or the whole of the plant glycogenin-like gene RNA transcript, including the mature target mRNA. In another embodiment, plant glycogenin-like gene suppression may be engineered by transforming a plant cell with a nucleic acid construct encoding a ribozyme that cleaves the plant glycogenin-like gene mRNA transcript.

[0177] In another embodiment, the plant glycogenin-like mRNA transcript can be suppressed through the use of RNA interference, referred to herein as RNAi. RNAi allows for selective knock out of a target gene in a highly effective and specific manner. The RNAi technique involves introducing into a cell double-stranded RNA (dsRNA) which corresponds to exon portions of a target gene such as an endogenous plant glycogenin-like gene. The dsRNA causes the rapid destruction of the target gene's messenger RNA, i.e. an endogenous plant glycogenin-like gene mRNA, thus preventing the production of the plant glycogenin-like protein encoded by that gene. The RNAi constructs of the invention confer expression of dsRNA which corresponds to exon portions of an endogenous plant glycogenin-like gene. The strands of RNA that form the dsRNA are complementary strands encoded by a coding region, i.e., an exon-encoding sequence, at the 3′ end of the plant glycogenin-like gene.

[0178] The dsRNA has an effect on the stability of the mRNA. The mechanism by which dsRNA results in the loss of the targeted homologous mRNA is still not well understood (Cogoni and Macino, 2000, Genes Dev 10: 638-643; Guru, 2000, Nature 404, 804-808; Hammond et al., 2001, Nature Rev Gen 2: 110-119). Current theories suggest that a catalytic or amplification process occurs that involves an initiation step and an effector step.

[0179] In the initiation step, input dsRNA is digested into 21-23 nucleotide “guide RNAs”. These guide RNAs are also referred to as siRNAs, or short interfering RNAs. Evidence indicates that siRNAs are produced when a nuclease complex, which recognizes the 3′ ends of dsRNA, cleaves dsRNA (introduced directly or via a transgene or virus) ~22 nucleotides from the 3′ end. Successive cleavage events, either by one complex or several complexes, degrade the RNA to 19-20 bp duplexes (siRNAs), each with 2-nucleotide 3′ overhangs. RNase III-type endonucleases cleave dsRNA to produce dsRNA fragments with 2-nucleotide 3′ tails; thus an RNase III-like activity appears to be involved in the RNAi mechanism. Because of the potency of RNAi in some organisms, it has been proposed that siRNAs are replicated by an RNA-dependent RNA polymerase (Hammond et al., 2001, Nature Rev Gen 2:110-119; Sharp, 2001, Genes Dev 15: 485-490).

[0180] In the effector step, the siRNA duplexes bind to a nuclease complex to form what is known as the RNA-induced silencing complex, or RISC. The nuclease complex responsible for digestion of mRNA may be identical to the nuclease activity that processes input dsRNA to siRNAs, although its identity is currently unclear. In either case, the RISC targets the homologous transcript by base pairing interactions between one of the siRNA strands and the endogenous mRNA. It then cleaves the mRNA ~12 nucleotides from the 3′ terminus of the siRNA (Hammond et al., 2001, Nature Rev Gen 2:110-119; Sharp, 2001, Genes Dev 15: 485-490).

[0181] Methods and procedures for successful use of RNAi technology in post-transcriptional gene silencing in plant systems have been described by Waterhouse et al. (1998, Proc Natl Acad Sci USA, 95(23):13959-64). Methods specific to construction of the RNAi constructs of the invention can be found in Examples 2 and 6 as well as FIGS. 6 and 10. While the invention encompasses use of any plant glycogenin-like gene of the invention in the RNAi constructs, in a preferred embodiment, the strands of RNA that form the dsRNA are complementary strands encoded by a coding region on the 3′ end, from nucleotide residues 1196-1662 of SEQ ID NO:2.
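
As a rough illustration of how the two complementary arms of such a dsRNA construct relate to the coordinates given above, the following Python sketch extracts a 1-based, inclusive region from a cDNA and derives its reverse complement. The sequence used is a made-up placeholder rather than SEQ ID NO:2, and the function names are hypothetical; it is a sketch under those assumptions, not the construct design used in the Examples.

```python
# Minimal sketch: derive the two complementary arms of a dsRNA construct
# from 1-based, inclusive cDNA coordinates. The demo sequence is a
# placeholder, NOT SEQ ID NO:2; all function names are illustrative only.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_region(cdna: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of cdna."""
    if not (1 <= start <= end <= len(cdna)):
        raise ValueError("coordinates fall outside the sequence")
    return cdna[start - 1:end]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def dsrna_arms(cdna: str, start: int, end: int) -> tuple[str, str]:
    """Return (sense_arm, antisense_arm) for the chosen region."""
    sense = extract_region(cdna, start, end)
    return sense, reverse_complement(sense)


if __name__ == "__main__":
    placeholder = "ATGGCAAACTCTCCCGCTGA" * 100  # 2000 nt dummy sequence
    sense, antisense = dsrna_arms(placeholder, 1196, 1662)
    # Both arms span 1662 - 1196 + 1 residues.
    print(len(sense), len(antisense))
```

With the real cDNA substituted for the placeholder, the same call would return the sense and antisense arms for residues 1196-1662.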

[0182] For all of the aforementioned suppression or antisense constructs, it is preferred that such nucleic acid constructs express specifically in organs where starch synthesis occurs (i.e. tubers, seeds, stems, roots and leaves) and/or the plastids where starch synthesis occurs. Alternatively, it may be preferred to have the suppression or antisense constructs expressed constitutively. Thus, constitutive promoters, such as the nopaline synthase or CaMV 35S promoter, may also be used to express the suppression constructs. A most preferred promoter for these suppression or antisense constructs is a rice actin promoter. Alternatively, a co-suppression construct promoter can be one that expresses with the same tissue and developmental specificity as the plant glycogenin-like gene.

[0183] In accordance with the present invention, desired plants with suppressed target gene expression may also be engineered by transforming a plant cell with a co-suppression construct. A co-suppression construct comprises a functional promoter operatively associated with a complete or partial plant glycogenin-like nucleic acid molecule. According to the present invention, it is preferred that the co-suppression construct encodes fully functional plant glycogenin-like gene mRNA or enzyme, although a construct encoding an incomplete plant glycogenin-like gene mRNA may also be useful in effecting co-suppression.

[0184] In accordance with the present invention, desired plants with suppressed target gene expression may also be engineered by transforming a plant cell with a construct that can effect site-directed mutagenesis of the plant glycogenin-like gene. For discussions of nucleic acid constructs for effecting site-directed mutagenesis of target genes in plants see, e.g., Mengiste et al., 1999, Biol. Chem. 380:749-758; Offringa et al., 1990, EMBO J. 9:3077-84; and Kanevskii et al., 1990, Dokl. Akad. Nauk. SSSR 312:1505-7. It is preferred that such constructs effect suppression of plant glycogenin-like genes by replacing the endogenous plant glycogenin-like gene nucleic acid molecule through homologous recombination with either an inactive or deleted plant glycogenin-like protein coding nucleic acid molecule.

[0185] In yet another embodiment, antisense technology can be used to inhibit plant glycogenin-like gene mRNA expression. Alternatively, the plant can be engineered, e.g., via targeted homologous recombination, to inactivate or “knock out” expression of the plant's endogenous plant glycogenin-like protein. The plant can be engineered to express an antagonist that hybridizes to one or more regulatory elements of the gene to interfere with control of the gene, such as binding of transcription factors, or disrupting protein-protein interaction. The plant can also be engineered to express a co-suppression construct. The suppression technology may also be useful in down-regulating the native plant glycogenin-like gene of a plant where a foreign plant glycogenin-like gene has been introduced. To be effective in altering the activity of a plant glycogenin-like protein in a plant, it is preferred that the nucleic acid molecules are at least 50, preferably at least 100 and more preferably at least 150 nucleotides in length. In one aspect of the invention, the nucleic acid molecule expressed in the plant cell can comprise a nucleotide sequence of the invention which encodes a full length plant glycogenin-like protein and wherein the nucleic acid molecule has been transcribed only in the antisense direction.

[0186] In a particular embodiment of the invention, a plant is engineered to express a dsRNA homologous to a portion of the coding region of an endogenous PGSIP or a plant glycogenin-like gene transcribed in the antisense direction in order to alter the starch content of the plant. In a preferred embodiment, the starch content is 40%, 30%, 20%, 10%, or 5% less than that of a non-engineered control plant(s). In another preferred embodiment, starch is absent from certain plant organs or tissues in comparison to a non-engineered control plant(s). In one embodiment starch content is decreased or absent in the leaves of plants engineered using the antisense technology described herein when compared to the starch content in a non-engineered control plant(s). In other embodiments the starch content of tubers or seeds is decreased or absent in plants engineered using the antisense technology described herein when compared to the starch content in a non-engineered control plant(s). Plant tissues in which starch content can be decreased using the methods of the invention include but are not limited to endosperm, leaf mesophyll, and root or stem cortex or pith.

[0187] In another aspect of the invention, the nucleic acid molecules of the invention are expressed in a plant cell engineered to express a dsRNA homologous to a portion of the coding region of an endogenous PGSIP or engineered using the antisense technology described herein, and the plant cell and plants derived from such cells exhibit altered starch content. The altered starch content comprises a decrease in the ratio of amylose to amylopectin. In one embodiment of the invention, the ratio of amylose to amylopectin decreases by 10%, 20%, 30%, 40%, or 50% in comparison to a non-engineered control plant(s).

[0188] In a particular embodiment, the nucleic acid molecules of the invention are used to express a dsRNA homologous to a portion of the coding region of an endogenous PGSIP, or are used in the antisense technology described herein, in conjunction with a developmentally specific promoter directed towards later stages of development. In this particular embodiment, starch content in leaves of a plant can decrease, while starch content in other organs and tissues of a plant is altered in the same or different ways.

[0189] In another particular embodiment, the nucleic acid molecules of the invention are used to express a dsRNA homologous to a portion of the coding region of an endogenous PGSIP, or are used in the antisense technology described herein, in conjunction with a developmentally specific promoter directed towards later stages of seed development in cereal crops. In this embodiment, the ratio of small starch granules to large starch granules increases. An increased ratio of small to large starch granules results in greater accessibility of starch granules, which has certain industrial and commercial advantages related to extraction and processing of starch.

[0190] The progeny of the transgenic or genetically-engineered plants of the invention containing the nucleic acids of the invention are also encompassed by the invention.

5.6 Modified Starch

[0191] The invention encompasses methods of altering starch synthesis in a plant and the resulting modified starch produced.

[0192] In the context of the present invention, “altering starch synthesis” means altering any aspect of starch production in the plant, from initiation by the starch primer to downstream aspects of starch production such as elongation, branching and storage, such that it differs from starch synthesis in the native plant. In the invention, this is achieved by altering the activity of the starch primer, which includes, but is not limited to, its function in initiating starch synthesis, its temporal and spatial distribution and specificity, and its interaction with downstream factors in the synthesis pathway. The effects of altering the activity of the starch primer may include, for example, increasing or decreasing the starch yield of the plant; increasing or decreasing the rate of starch production; altering temporal or spatial aspects of starch production in the plant; altering the initiation sites of starch synthesis; changing the optimum conditions for starch production; and altering the type of starch produced, for example in terms of the ratio of its different components. For example, the endosperm of mature wheat and barley grains contains two major classes of starch granules: large, early formed “A” granules and small, later formed “B” granules. Type A starch granules in wheat are about 20 μm in diameter and type B around 5 μm in diameter (Tester, 1997, in: Starch Structure and Functionality, Frazier et al., eds., Royal Society of Chemistry, Cambridge, UK). Rice starch granules are typically less than 5 μm in diameter, while potato starch granules can be greater than 80 μm in diameter. The quality of starch in wheat and barley is greatly influenced by the ratio of A-granules to B-granules. Altering the activity of the starch primer will influence the number of granule initiation sites, which will be an important factor in determining the number and size of formed starch granules. The degree to which the starch priming activity of the plant is affected will depend at least upon the nature of the nucleic acid molecule or antagonist introduced into the plant, and the amount present. By altering these variables, a person skilled in the art can regulate the degree to which starch synthesis is altered according to the desired end result.

[0193] The methods of the invention (i.e. engineering a plant to express a construct comprising a plant glycogenin-like nucleic acid) can, in addition to altering the total quantity of starch, alter the fine structure of starch in several ways including, but not limited to, altering the ratio of amylose to amylopectin, altering the length of amylose chains, altering the length of chains of amylopectin fractions of low molecular weight or high molecular weight fractions, or altering the ratio of low molecular weight or high molecular weight chains of amylopectin. The methods of the invention can also be utilized to alter the granule structure of starch, i.e. the ratio of large to small starch granules from a plant or a portion of a plant. The alteration in the structure of starch can in turn affect the functional characteristics of starch such as viscosity, elasticity, or rheological properties of the starch such as those measured using viscometric analysis. The modified starch can also be characterized by an alteration of more than one of the above-mentioned properties.

[0194] In an embodiment the length of amylose chains in starch extracted from a plant engineered to express a construct comprising a plant glycogenin-like nucleic acid is decreased by at least 50, 100, 150, 200, 250, or 300 glucose units in length in comparison to amylose from non-modified starch from a plant of the same genetic background. In another embodiment, the length of amylose chains in starch is increased by at least 50, 100, 150, 200, 250, or 300 glucose units in length in comparison to amylose from non-modified starch from a plant of the same genetic background.

[0195] In an embodiment of the invention, the ratio of amylose to amylopectin decreases by 10%, 20%, 30%, 40%, or 50% in comparison to a non-engineered control plant(s).

[0196] In a preferred embodiment, the ratio of low molecular weight chains to high molecular weight chains of amylopectin is altered by 10%, 20%, 30%, 40%, or 50% in comparison to a non-engineered control plant(s).

[0197] In another preferred embodiment the average length of low molecular weight chains of amylopectin is altered by 5, 10, 15, 20, or 25 glucose units in length in comparison to a non-engineered control plant(s). In yet another preferred embodiment the average length of high molecular weight chains of amylopectin is altered by 10, 20, 30, 40, 50, 60, 70, or 80 glucose units in length in comparison to a non-engineered control plant(s).

[0198] According to one aspect of the invention, the ratio of small starch granules to large granules is altered by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more in comparison to a non-engineered control plant(s).

[0199] In another aspect, the invention provides a complex comprising plant glycogenin-like proteins and plant polysaccharides. The inventors believe that members of the family of plant glycogenin-like proteins serve as primers for biosynthesis of a range of polysaccharides in plants, including but not limited to starch, hemicelluloses, and cellulose. The plant polysaccharides may be either homopolysaccharides comprising only a single type of monomeric unit or heteropolysaccharides comprising two or more different kinds of monomeric units. Accordingly, it is contemplated that plant glycogenin-like proteins form complexes with such polysaccharides and their subunits. Glycosylated plant glycogenin-like proteins are encompassed in the invention. In the broadest sense, the invention encompasses a complex comprising a plant glycogenin-like protein and a number of monomeric units, also referred to as subunits of the polysaccharides. Examples of monomeric units include but are not limited to glucose, xylose, mannose, galactose, ribose, and rhamnose, and may be a hexose or a pentose, wherein the number ranges from a single unit to thousands of monomeric units, and wherein the linkages between the subunits may vary, resulting in linear and/or branched structures. For example, starch and precursors of starch comprise glucose subunits joined by either alpha 1,4-glycosidic bonds or alpha 1,6-glycosidic linkages; cellulose and precursors of cellulose comprise glucose subunits joined by beta 1,4-glycosidic bonds. The number of monomeric units ranges from 1-3, 2-5, 4-10, 8-16, 15-30, 20-40, 30-60, 50-100, 75-200, 100-500, or 300-800 monomeric units. Alternatively, the number of monomeric units ranges from 1000-5000, 5000-10,000, or 10,000-15,000 monomeric units. Preferably, the polysaccharide or its precursor is attached to a hydroxyl group of a tyrosine residue of the plant glycogenin-like protein. Without being bound by any theory or any mechanism, during biosynthesis, additional subunits, either singly or as oligosaccharides, are added to the complex such that the total number of subunits increases over a period of time.

[0200] In one embodiment, the invention encompasses complexes comprising plant glycogenin-like protein and starch. In a specific embodiment, the complexes of plant glycogenin-like protein and starch are purified. The starch molecule or its precursor, including a single glucose subunit, can be attached to a hydroxyl group of a tyrosine residue of the plant glycogenin-like protein. In various embodiments, in a population of complexes, the starch molecules that are complexed with the plant glycogenin-like proteins have different chain lengths and branching structures, for example, 1-3, 2-5, 4-10, 8-16, 15-30, 20-40, 30-60, 50-100, 75-200, 100-500, 200-700 glucose subunits. The polysaccharide complexed with the plant glycogenin-like proteins may consist of 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, or 190 glucose subunits in length. In preferred embodiments of the invention, the polysaccharide is amylopectin, amylose, or a combination of both.

[0201] The complexes of the invention can be used to identify sites of starch synthesis in stages of plant development. Briefly, the glycogenin-like protein can be labeled by means described herein and the complexes from tissues, cells, or organs can then be separated by size and compared among different stages of development.

[0202] The embodiments described in each section above apply to the other aspects of the invention, mutatis mutandis.

6 EXAMPLES

Example 1 Identification of Plant Glycogenin-Like Gene Homologues in Arabidopsis

[0203] Arabidopsis nucleic acid molecules showing similarities to yeast glycogenin genes were identified by sequence analysis. The sequence analysis programs used in the following examples are from the Wisconsin Package of computer programs (Devereux et al., Nucl. Acids Res. 12: 387 (1984); available from Genetics Computer Group, Madison, Wis.). ESTs and genes were identified using the program BLAST (Basic Local Alignment Search Tool; Altschul, S. F. et al (1990) J. Mol. Biol. 215:403-410, see also www.ncbi.nlm.nih.gov/BLAST/).

[0204] The sequence comparison and identification program tblastx was used with the yeast glycogenin 1 (Glg1) gene (GenBank:U25546, Swiss_Prot (SP):P36143) to search against the Arabidopsis sequences collected in an in-house database comprising published plant sequences. A number of hits to this gene were obtained. One of the hits was identified as EMBL:AC004260 version GI:2957150, which was annotated as “Sequencing in progress.” Therefore, the region showing homology to the yeast Glg1 gene was extracted and a protein sequence was predicted using GENSCAN (a protein prediction program, Burge, C. and Karlin, S. (1997), J. Mol. Biol., http://genes.mit.edu/GENSCANinfo.html). A blastp analysis using this protein showed strong homology to the glycogenin genes from C. elegans (8e-22), human (2e-19) and yeast (8e-06). A search in the database at NCBI at a later date showed that this gene is listed as T14N5.1 with the accession number EMBL:AC004260 (SPTREMBL:O80649) and annotated as “Unknown protein”. The protein sequence is set forth in SEQ ID NO: 6.

[0205] The in-house database described above was also searched with the yeast Glg2 gene (GB:U25436, SP:P47011) and the sequence identified above (accession EMBL:AC004260) using the programs tblastn and tblastx. A number of further hits were identified. Out of the list of best hits, accession no. EMBL:AB026654, gene_id:MVE11.2 (SPTREMBL:Q9LSB1), showed strong homology to the glycogenin genes from C. elegans (1e-21), GYG2 human (3e-21) and yeast (5e-06). The genomic sequence representing this gene was extracted and is shown in SEQ ID NO: 1. Further analysis by the organelle prediction programs PREDOTAR and/or TargetP (Emanuelsson et al., J. Mol. Biol. 300: 1005-1016 (2000)) showed that the protein comprises a transit peptide as shown in Table 1 below.

TABLE 1 TargetP V1.0 Prediction Results. Number of input sequences: 1. Cleavage site predictions included. Using PLANT networks.

Name             Length  cTP    mTP    SP     Other  Loc.  RC  TPlen
AT3g18660 cDNA   659     0.792  0.181  0.004  0.172  C     2   63
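
To make the columns of Table 1 easier to read, the following Python sketch shows one way the Loc. and RC values can be derived from the four scores. It assumes the convention described for TargetP, in which the location is the highest-scoring class and the reliability class (RC, 1 = most reliable) is set by the difference between the two highest scores in steps of 0.2; the function name and the thresholds as coded here are illustrative assumptions, not the program's source.

```python
# Sketch: interpret TargetP-style output scores.
# Assumption: Loc is the top-scoring class and RC is derived from the gap
# between the two highest scores (RC 1: > 0.8, 2: > 0.6, 3: > 0.4,
# 4: > 0.2, otherwise 5). The helper name is hypothetical.

LOCATION_CODES = {"cTP": "C", "mTP": "M", "SP": "S", "other": "_"}


def predict_location(scores: dict[str, float]) -> tuple[str, int]:
    """Return (location code, reliability class) for one sequence."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best_label, best_score), (_, second_score) = ranked[0], ranked[1]
    diff = best_score - second_score
    if diff > 0.8:
        rc = 1
    elif diff > 0.6:
        rc = 2
    elif diff > 0.4:
        rc = 3
    elif diff > 0.2:
        rc = 4
    else:
        rc = 5
    return LOCATION_CODES[best_label], rc


if __name__ == "__main__":
    table1_scores = {"cTP": 0.792, "mTP": 0.181, "SP": 0.004, "other": 0.172}
    print(predict_location(table1_scores))
```

Under these assumptions the Table 1 scores give a chloroplast prediction (C) with a gap of about 0.61 between the cTP and mTP scores, which falls in reliability class 2, in agreement with the table.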

[0206] Performing blastp analysis using this protein against yeast sequences in an in-house database clearly showed sequence similarities to the yeast Glg1 and Glg2 genes. were and a CD-ROM containing the full genome sequence of Arabidopsis was made available. A search of the Arabidopsis genome sequencing project database published (Nature 408: 791, (2000)) showed that EMBL:AB026654 corresponded to the sequence having accession no. AT3g18660. However, AT3g18660 is reported to encode a protein of 575 amino acids whereas our analysis shows that this gene actually encodes a protein of 659 amino acids. A blastp analysis against the in-house database showed strong hits to five genes, EMBL:AC004260, AC000106, AC069144, AL035678 and AL035678 (corresponding to MIPS:at1g77130, at1g08990, at1g54940, at4g33330 and at4g33340). The sequences of these five genes are shown in SEQ ID NOs: 6, 8, 10, 12 and 14. The different accession numbers of these genes and their description in various databases are presented in Table 2.

TABLE 2 Accession numbers of the genes in various databases:

MIPS       SPTREMBL  EMBL      GENE       Size
AT3g18660  Q9LSB1    AB026654  MVE11.2    659^(a) aa
at1g77130  O80649    AC004260  T14N5.1    1201 aa
at1g08990  O04031    AC000106  F7g19.14   546^(b) aa
at1g54940  Q9FZ37    AC069144  F14C21.47  557 aa
at4g33330  Q9SZB0    AL035678  F17M5.90   333 aa
at4g33340  Q9SZB1    AL035678  F17M5.100  277 aa

[0207] TABLE 3 Comparison of AT3g18660 with other glycogenin-like genes from Arabidopsis:

Comparison             % identity nucleotide  % identity protein
AT3g18660 X at1g77130  68                     65
AT3g18660 X at1g08990  61                     50
AT3g18660 X at1g54940  61                     49
AT3g18660 X at4g33330  60                     58
AT3g18660 X at4g33340  60                     46

[0208] Table 3 shows the percentage identity between AT3g18660 and other glycogenin genes from Arabidopsis using the programme BESTFIT of the GCG package. In each case, the full length nucleotide and peptide sequence was compared to that of the AT3g18660 gene.
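
For readers unfamiliar with how a percentage identity figure is obtained from a pairwise alignment, the following Python sketch computes it from two already-aligned sequences. It assumes one common convention (identical positions divided by aligned positions, with gap columns excluded); BESTFIT's own alignment and reporting may differ in detail, and the function name and example strings are hypothetical.

```python
# Sketch: percentage identity between two pre-aligned sequences.
# Assumption: columns containing a gap ("-") in either sequence are
# excluded from the count. This is NOT a reimplementation of BESTFIT.


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return % identical residues over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # skip gap columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 5 gap-free columns, 4 of them identical.
    print(percent_identity("ACGT-A", "ACCTTA"))
```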

[0209] These levels of identity are consistent with the genes encoding proteins with the same function. For example, the two yeast glycogenin genes are about 50% identical to one another at the protein level and are both known to be involved in the same pathway; both are essential for the production of glycogen and one can complement the function of the other.

[0210] It is interesting that the carboxyl terminal region of the protein encoded by at1g77130 shows homology to a starch synthase (dull1) from maize. In yeast, glycogenin and glycogen synthase physically interact. This finding may be the first indication that a similar scenario exists in plants. The at1g77130 gene appears to be a duplication of the AT3g18660 sequence, and the small region of homology with dull1 may indicate that during the course of evolution this gene has become physically close to dull1. Recently published work (Yanai et al., 2001, Proc. Natl. Acad. Sci. USA 98(14): 7940-7945) suggests that a functional association between two genes can be derived from the existence of a fusion of the two as one continuous sequence in another genome. In yeast, it has been shown by experimentation that glycogenin and glycogen synthase physically interact and are associated together in an enzymatic complex to allow glycogen biosynthesis. The inventors believe that PGSIP interacts with soluble starch synthases at the start of the starch biosynthesis process. This could be the first step in the formation of a starch biosynthetic enzyme complex in which PGSIP acts as a template and starch synthases extend the chain, followed by branching by starch branching enzymes and other starch synthesis enzymes. It is likely that starch biosynthesis enzymes become associated with the very first complex formed in the process of the synthesis of a starch polymer.

[0211] The sequences of the six genes listed in Table 2 were compared by BLAST against the Arabidopsis sequences in an in-house database and a further hit was obtained. The identified sequence corresponding to SPTREMBL: Q8W4AZ, EMBL: AY062695 encodes a protein of 618 amino acids that showed strong homology to the glycogenin genes (4e-26). Further analysis of the sequence indicated that the protein represents the C terminal domain of the At1g77130 gene (O80649, T14N5.1) and is also annotated as At1g77130, T14N5.1, which encodes an unknown protein. This sequence is set forth in SEQ ID NO: 23.

Example 2 Isolation of cDNA Encoding A. thaliana Glycogenin Homologue

[0212] Primers were designed to clone a full length cDNA representing the accession number AB026654, gene_id:MVE11.2 (at3g18660 (MIPS)) from an Arabidopsis thaliana cDNA pool. Sequencing the full length clone indicated that the gene encoded a protein of 659 amino acids and consists of five exons. The cDNA sequence is designated as SEQ ID NO: 2.

[0213] Arabidopsis thaliana was grown in growth cabinets with a 16 hour light and 8 hour dark period at a temperature of 22° C. during the day and 17° C. during the night. A mixed cDNA sample was made with total RNA from 10 different tissues mixed together in equal amounts: root, dividing cell culture, young leaf, mature leaf, stem, seedling, seed, flower buds+flowers, and plants subjected to drought for 6 days and for 10 days.

[0214] The primer used to make the first strand cDNA using Superscript II was from the original paper on PCR amplification by Frohman et al. ((1988) Proc. Natl. Acad. Sci. USA, 85:8998): 5′-GACTCGAGTCGACATCGATTTTTTTTTTTTTTTTT-3′.

[0215] 1 μl of this cDNA was used to amplify the cDNA clone representing the accession number GTD:S:1870408 (gene id:MVE11.2) utilizing the primers Glgf1 and Glg int1, and ClaF and Glgstop2.

Glgf1 primer:    5′-GACCATGGCAAACTCTCCCGC-3′
Glg int1 primer: 5′-GCAGCATACTTTTCCAATTAC-3′
ClaF primer:     5′-GCAAGTTCCGGCTATGGCAGC-3′
Glgstop2 primer: 5′-GCGTCACAAGTTATGGCCGGG-3′
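
As a quick sanity check on primers such as those listed above, the following Python sketch reports length, GC content and a rough melting temperature. The primer names are transcribed from the list above; the Wallace rule (Tm = 2 × (A+T) + 4 × (G+C)) is only a rule of thumb and is an assumption made here for illustration, not a method stated in this Example.

```python
# Sketch: basic properties of the four cloning primers.
# Assumption: the Wallace rule is used as a rough Tm estimate; it was not
# part of the original protocol and is least reliable for longer oligos.

PRIMERS = {
    "Glgf1": "GACCATGGCAAACTCTCCCGC",
    "Glg int1": "GCAGCATACTTTTCCAATTAC",
    "ClaF": "GCAAGTTCCGGCTATGGCAGC",
    "Glgstop2": "GCGTCACAAGTTATGGCCGGG",
}


def primer_stats(seq: str) -> dict:
    """Return length, GC count, GC percentage and Wallace-rule Tm."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return {
        "length": len(seq),
        "gc_count": gc,
        "gc_percent": round(100.0 * gc / len(seq), 1),
        "wallace_tm_c": 2 * at + 4 * gc,
    }


if __name__ == "__main__":
    for name, sequence in PRIMERS.items():
        print(name, primer_stats(sequence))
```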

[0216] PCR Conditions:

[0217] Five 50 μl reactions were set up as follows:

Composition                      PCR Programme
Water                 35.5 μl    95° C.  2 min (hot start)
10x buffer            5 μl       95° C.  3 min
4 mM dNTPs            2.5 μl     55° C.  30 sec
Pfu Turbo polymerase  1 μl       72° C.  2 min 30 sec
4 mM primers          5 μl       72° C.  10 min (extension)
cDNA                  1 μl
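
The component volumes above can be checked and scaled with a few lines of Python. The sketch below confirms that the per-reaction volumes add up to 50 μl and scales them for several reactions; the optional overage factor (extra volume to allow for pipetting loss) and the function names are assumptions added for illustration and are not part of the protocol.

```python
# Sketch: check and scale the per-reaction PCR volumes (in microlitres).
# Assumption: the "overage" parameter is an illustrative extra, not
# something specified in the protocol.

REACTION_UL = {
    "water": 35.5,
    "10x buffer": 5.0,
    "4 mM dNTPs": 2.5,
    "Pfu Turbo polymerase": 1.0,
    "4 mM primers": 5.0,
    "cDNA": 1.0,
}


def total_volume(recipe: dict[str, float]) -> float:
    """Sum of all component volumes for one reaction."""
    return sum(recipe.values())


def scale_recipe(recipe: dict[str, float], reactions: int,
                 overage: float = 0.0) -> dict[str, float]:
    """Scale every component for `reactions` tubes plus fractional overage."""
    factor = reactions * (1.0 + overage)
    return {name: round(volume * factor, 2) for name, volume in recipe.items()}


if __name__ == "__main__":
    print("per reaction:", total_volume(REACTION_UL), "ul")
    print("five reactions:", scale_recipe(REACTION_UL, 5))
```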

[0218] Two products were obtained. These were cloned in pBluescript vector (SK−) (Stratagene) and a full length clone was obtained. The map of this plasmid is shown in FIG. 1.

Example 3 Functional Analysis of the Arabidopsis cDNA

[0219] Yeast contains two glycogenin genes, Glg1 (YKR058w) and Glg2 (YJL137c). Double mutants in the above genes do not make any glycogen (Cheng et al (1995) Mol. and Cell Biology 15(12):6632-6640). Mutant yeast strains from the EUROSCARF (European Saccharomyces Cerevisiae ARchives For Functional Analysis) collection were obtained from SRD GmbH, D61440, Germany along with the wild type. Single mutants in the Glg1 and Glg2 genes were obtained in addition to the double mutant. Additionally a plasmid containing the entire Glg2 ORF including the promoter was also obtained. This plasmid was used as a positive control to establish a complementation assay. The strains are described below.

Wild type:

ORF      Accession no.  Strain       Genotype
-        Y00000         BY4741       MATa; his3Δ1; leu2Δ0; met15Δ0; ura3Δ0

[0220] Single Mutants:

ORF      Accession no.  Strain       Genotype
YKR058W  Y15129         glg1 mutant  BY4742; Mat alpha; his3Δ1; leu2Δ0; ura3Δ0; YKR058w::kanMX4
YJL137c  Y17003         glg2 mutant  BY4742; Mat a; his3Δ1; leu2Δ0; ura3Δ0; YJL137c::kanMX4

[0221] Double Mutants:

Mutant Strains        Genotype
1. glg1/glg2 deleted  BY4742; Mat alpha; his3Δ1; leu2Δ0; ura3Δ0; YKR058w::kanMX4; YJL137c::kanMX4
2. glg1/glg2 deleted  BY4742; Mat a; his3Δ1; leu2Δ0; ura3Δ0; YKR058w::kanMX4; YJL137c::kanMX4

[0222] Plasmid:

Plasmid name           Gene                 Marker
pYCG_YJL137c (pRS416)  Glg2 ORF + promoter  URA3

[0223] Glycogen Defect Assay

[0224] First, it was established that the wild type and the double mutants were indeed different. For this experiment, freshly grown wild type and double mutant cells were picked from YPD plates and suspended in 100 μl of water in an eppendorf tube. To this tube approximately 100 μl of glass beads (Sigma) and 10-20 μl of undiluted Lugol solution (Sigma) were added. The cells were vortexed briefly, spun down for a few seconds and assayed for color development. The wild type cells stained blue whereas the double mutants did not stain and appeared white.

[0225] Complementation Assay

[0226] Double mutants were transformed with the plasmid pRS416 and the transformants were selected on a CSM/Ura- plate (uracil drop out plate). As a negative control, double mutants were transformed without the plasmid. Many colonies were obtained on the positive plate but no colonies were obtained from the negative control, indicating that the transformation had worked. The transformed double mutants were grown overnight in CSM/Ura- liquid media along with wild type and single mutants. The next day OD₆₀₀ was checked to ensure equal amounts of cells in each of the tubes. Approximately equal amounts of cells were taken in an eppendorf tube and to this equal amounts of glass beads were added followed by 10-20 μl of undiluted Lugol solution (Sigma). The cells were vortexed briefly, centrifuged for a few seconds and assayed for colour development. Complementation was observed in the double mutants as they appeared blue, similar to the single glg1 and glg2 mutants.

[0227] Optimisation of the Assay to Distinguish Wildtype and Mutant Strains

[0228] A small amount of the wildtype (WT) and glycogenin double mutant (Mut) yeast strains was picked from a well-grown plate, resuspended in 1 ml of water, and vortexed briefly. The cells were diluted further in 1 ml of water and 50 μl of the diluted cells were plated on YPD plates. The plates were incubated at 30° C. for two days and afterwards were exposed to iodine vapour for 2-3 minutes by inverting them on top of a 500 ml glass beaker containing iodine chips (Sigma) placed on a low heater in a fume cupboard. The plates were then left open in the fume cupboard for 1 minute and the colour development was monitored. The WT cells stained brown and the double mutants (Mut) stained pale yellow.

[0229] Cloning PGSIP cDNA into the pYES2 Vector for Complementation Studies

[0230] Two constructs were made for the experiment: one contained the full length PGSIP cDNA including the transit peptide (TP) and another in which the transit peptide was removed (no transit peptide: NTP). These were cloned into the pYES2 vector (Invitrogen). Primers were designed to amplify the full length PGSIP cDNA with the transit peptide (primers TPF and TPR) and without the transit peptide (primers NTPF and NTPR) so that these could be cloned into the pYES2 vector. A BamHI restriction enzyme site was incorporated into the forward primers (TPF and NTPF) and a XhoI restriction enzyme site was incorporated into the reverse primers (TPR and NTPR). The NTP forward primer (NTPF) was designed so that it annealed at nucleotide position 190 of the full length PGSIP sequence, and an ATG initiation codon was inserted after the BamHI site to ensure that translation into protein could occur. This resulted in a cDNA sequence lacking the first 63 amino acids of the PGSIP cDNA sequence, which represent the transit peptide as predicted by the TargetP program (Emanuelsson et al, J. Mol. Biol. 300:1005-1016 (2000)). The primer sequences were as follows:

TPF  5′-GGATCCGACCATGGCAAACTCTCCCGC-3′
TPR  5′-CTCGAGGCGTCACAAGTTATGGCCGGG-3′
NTPF 5′-GGATCCATGTGTTGTTGTTTCACCAAG-3′
NTPR 5′-CTCGAGGCGTCACAAGTTATGGCCGGG-3′
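
The primer design described above can be checked with a short Python sketch: removing 63 codons (189 nucleotides) means the retained sequence begins at nucleotide 190, and each primer should carry its restriction site at the 5′ end. The sketch assumes that nucleotide numbering starts at the first base of the initiation codon, uses the standard recognition sequences for BamHI (GGATCC) and XhoI (CTCGAG), and uses hypothetical function names.

```python
# Sketch: consistency checks for the TP/NTP primer design.
# Assumption: coding-sequence numbering starts at the A of the start codon.

BAMHI_SITE = "GGATCC"
XHOI_SITE = "CTCGAG"

PRIMERS = {
    "TPF": "GGATCCGACCATGGCAAACTCTCCCGC",
    "TPR": "CTCGAGGCGTCACAAGTTATGGCCGGG",
    "NTPF": "GGATCCATGTGTTGTTGTTTCACCAAG",
    "NTPR": "CTCGAGGCGTCACAAGTTATGGCCGGG",
}


def first_retained_nucleotide(removed_amino_acids: int) -> int:
    """1-based nucleotide position of the first codon kept after truncation."""
    return 3 * removed_amino_acids + 1


def has_site_then_atg(primer: str, site: str) -> bool:
    """True if the primer begins with `site` immediately followed by ATG."""
    return primer.startswith(site + "ATG")


if __name__ == "__main__":
    print("first retained nucleotide:", first_retained_nucleotide(63))
    for name in ("TPF", "NTPF"):
        print(name, "starts with BamHI site:", PRIMERS[name].startswith(BAMHI_SITE))
    for name in ("TPR", "NTPR"):
        print(name, "starts with XhoI site:", PRIMERS[name].startswith(XHOI_SITE))
    print("NTPF has ATG after BamHI site:", has_site_then_atg(PRIMERS["NTPF"], BAMHI_SITE))
```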

[0231] A 50 μl PCR reaction was set up with Pfu polymerase (Stratagene) as follows: a cocktail solution was made with 35.5 μl water, 5 μl 10×PCR buffer+, 2.5 μl solution (20 mM MgCl and 4 mM dNTPs), 1 μl Pfu polymerase, 5 μl 4 mM primers (TP/NTP), and 1 μl cDNA (1/100 dil). The PCR thermocycler program consisted of 95° C. for 3 min (hot start), followed by 30 cycles of 95° C. for 30 sec, 50° C. for 30 sec, and 72° C. for 3 min. The final step in the program held the temperature at 24° C.
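As a consistency check on the reaction mix, the component volumes can be summed and the final concentrations derived with C1·V1 = C2·V2. This is an illustrative sketch only; the dictionary keys and function names are hypothetical, and the volumes are taken to be microlitres.

```python
# Component volumes in microlitres, as listed for the Pfu reaction.
MIX_UL = {
    "water": 35.5,
    "10x PCR buffer": 5.0,
    "MgCl/dNTP solution": 2.5,
    "Pfu polymerase": 1.0,
    "primers": 5.0,
    "cDNA (1/100 dilution)": 1.0,
}


def total_volume(mix):
    """Sum of all component volumes."""
    return sum(mix.values())


def final_conc(stock_conc, volume_ul, total_ul):
    """Final concentration from C1*V1 = C2*V2, in the unit of stock_conc."""
    return stock_conc * volume_ul / total_ul


total = total_volume(MIX_UL)
print("total volume (ul):", total)
print("buffer (x):", final_conc(10, MIX_UL["10x PCR buffer"], total))
print("MgCl (mM):", final_conc(20, MIX_UL["MgCl/dNTP solution"], total))
print("dNTPs (mM):", final_conc(4, MIX_UL["MgCl/dNTP solution"], total))
```

The listed volumes add up to the stated reaction volume, with the buffer at 1× and the dNTPs at 0.2 mM in the final mix.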

[0232] The amplified fragment was run out on an agarose gel, cut out and purified using the ‘Geneclean kit’ according to the manufacturer's instructions (Bio101). The purified cDNA fragments were ligated into pBluescript vector (Stratagene) cut with EcoRV restriction enzyme. Positive clones were identified and sequenced. Clones with the correct sequences were then cut with the restriction enzymes BamHI and XhoI and ligated into pYes2 vector cut with the same enzymes. Positive clones were identified and named pTPYes (FIG. 2) and pNTPYes (FIG. 3). In these plasmids, the cDNA was under the control of the yeast GAL1 promoter, which is both glucose repressible and galactose inducible.

[0233] Complementation Analysis with the Arabidopsis Glycogenin Gene

[0234] Yeast strains were transformed with the above plasmids following the method of Finley and Brent, 1995 (http://cmmg.biosci.wayne.edu/finlab/YTHprotocols.htm and links therein) in combination with the Clontech yeast transformation kit. From a freshly grown plate, a 5 ml culture of each yeast strain (WT and Mut) was inoculated in YPD medium (Clontech) and grown overnight with shaking at 30° C. The next day, 3 ml of freshly grown cells were inoculated into 150 ml YPD medium (OD600=0.2) and grown with shaking at 30° C. for 3-4 hours (OD600=0.7). 100 ml of cells were then transferred to two 50 ml orange cap tubes and centrifuged at room temperature at 2000 rpm for 3 minutes. The supernatant was discarded completely. The cells were washed by resuspending them in 2.5 ml of sterile water followed by centrifugation as before. The supernatant was discarded and the cells were resuspended by adding 625 μl of Lithium Acetate (LiAc)/TE (10 mM Tris HCl pH 7.5, 1 mM EDTA, 100 mM LiAc; made from a filter-sterile stock of 1 M LiAc, pH 7.5) to each tube. The cells were centrifuged as before and the supernatant was discarded. The cells were resuspended in 250 μl of LiAc/TE and then pooled into a single eppendorf tube, giving 500 μl of competent yeast cells. In an eppendorf tube the following was prepared: 6 μl Herring Testis DNA (Clontech, 10 mg/ml, boiled earlier for 10 minutes and quenched on ice), 8 μl DNA [pYes2 empty plasmid, TPYes and NTPYes DNA (˜2 μg)] and 6 μl of water, making a total volume of 20 μl. In another tube, the 20 μl mixture made above, 11 μl DMSO and 600 μl of 40% PEG 4000 in LiAc/TE (made from stocks of 1 M LiAc pH 7.5, filter-sterile 50% PEG 4000 in water, 1 M Tris HCl pH 7.5 and 0.5 M EDTA) were added to 100 μl of competent yeast cells. The tubes were inverted gently three to four times and incubated at 30° C. for 30 minutes. The tubes were inverted again gently and heat shocked at 42° C. for 20 minutes, after which 50-100 μl was plated directly on CSM/Ura-/glucose plates. The plates were incubated for two to three days at 30° C. Additionally, as a negative control, WT and Mut yeast strains were transformed with the empty pYes2 plasmid. As a positive control, the Mut strains were transformed with the yeast GLG2 gene (plasmid pRS416) purchased from EUROSCARF. The transformed cells were selected on CSM/Ura-glucose drop out plates. After two days the cells were picked individually into patches and streaked onto glucose and galactose plates. The resulting plates are listed in Table 4.

TABLE 4
Name   Glucose   Galactose
1. WT:pYes2 control   Yes
2. Mut:pYes2 control   Yes   Yes
3. WT:NTP   Yes   Yes
4. Mut:NTP   Yes   Yes
5. WT:TP   Yes   Yes
6. Mut:TP   Yes   Yes
7. Mut:yeast GLG2 gene +ve control   Yes   Yes
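The 40% PEG 4000 solution in LiAc/TE is made by combining the four stocks named above. The sketch below works out the stock volumes for a given batch size using C1·V1 = C2·V2. The 10 ml batch size and the function name are assumptions chosen for illustration; the text does not state how much solution was prepared.

```python
def stock_volumes(final_ml, recipe):
    """Volume of each stock (ml) needed for final_ml of solution.

    recipe maps a stock name to (stock concentration, final concentration),
    both in the same unit. Water makes up the remaining volume.
    """
    vols = {
        name: final_ml * final / stock
        for name, (stock, final) in recipe.items()
    }
    water = final_ml - sum(vols.values())
    if water < 0:
        raise ValueError("stocks are too dilute for this recipe")
    vols["water"] = water
    return vols


# Stock and final concentrations taken from the text.
PEG_LIAC_TE = {
    "50% PEG 4000": (50.0, 40.0),          # percent
    "1 M LiAc pH 7.5": (1000.0, 100.0),    # mM
    "1 M Tris HCl pH 7.5": (1000.0, 10.0), # mM
    "0.5 M EDTA": (500.0, 1.0),            # mM
}

# Hypothetical 10 ml batch.
for name, ml in stock_volumes(10.0, PEG_LIAC_TE).items():
    print(f"{name}: {ml:.2f} ml")
```

Because the PEG stock is only slightly more concentrated than the target, it accounts for most of the final volume and leaves little room for water.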

[0235] Yeast strains used for the complementation experiment (Table 5)

TABLE 5
Name
1. WT:pYes2 control
2. Mut:pYes2 control
3. Mut:TP
4. Mut:NTP
5. Mut:yeast GLG2

[0236] The plates listed in Table 4 and Table 5 were grown for two days at 30° C. as described above. The cells were diluted and plated onto both CSM/Ura-glucose and CSM/Ura-galactose plates. After two days of growth at 30° C. the cells were exposed to iodine vapour as described above and photographs were taken. The photographs confirmed that the assay worked, as the Mut strains containing the yeast GLG2 gene (no. 7 in Table 4) stained brown on both the glucose and galactose plates. The WT strain (no. 1 in Table 4) stained brown whereas the Mut strains containing the empty plasmid (no. 2 in Table 4) stained yellow. The cells containing the NTP plasmid (no. 4 in Table 4) stained yellow on glucose plates but brown on galactose plates, although the brown colour was not as intense as that observed in Mut strains containing the yeast GLG2 gene, indicating that the complementation is partial. These data indicate that the PGSIP cDNA is a functional orthologue of the yeast glycogenin gene and plays a role in starch biosynthesis in plants, and particularly in Arabidopsis. The cells containing the TP plasmid (no. 3 in Table 4) stained yellow on glucose and galactose plates, indicating that complementation was not achieved with this plasmid. In general, validating the function of plant genes by yeast complementation has been reported (Alderson et al., Proc. Natl. Acad. Sci. USA, 88:8602-8605 (1991); Vogel et al., Plant J, 13(5):673-683, 1998; Blazquez et al., Plant J, 13(5):685-689, 1998).
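The reasoning behind the staining results can be written as a small decision rule. This is a sketch under stated assumptions: brown staining is taken to mean glycogen accumulation, yellow its absence, and a gene under the galactose-inducible, glucose-repressible promoter is expected to be expressed only on galactose. The function name and the wording of the returned labels are hypothetical.

```python
def interpret(glucose_colour, galactose_colour):
    """Classify a mutant strain from its iodine staining on both media."""
    pair = (glucose_colour, galactose_colour)
    if pair == ("brown", "brown"):
        return "glycogen accumulates on both carbon sources"
    if pair == ("yellow", "brown"):
        return "galactose-dependent complementation"
    if pair == ("yellow", "yellow"):
        return "no complementation"
    return "unexpected pattern"


# Observations reported in the text for the mutant strains.
OBSERVED = {
    "Mut:yeast GLG2": ("brown", "brown"),
    "Mut:NTP": ("yellow", "brown"),
    "Mut:TP": ("yellow", "yellow"),
}

for strain, colours in OBSERVED.items():
    print(strain, "->", interpret(*colours))
```

The rule captures only the qualitative pattern; the weaker brown colour of the NTP strain, which the text reads as partial complementation, is not modelled.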

Example 4 cDNA Isolation from Maize Endosperm

[0237] Maize EST Identification

[0238] ESTs encoding the corn glycogenin gene were identified using the program BLAST (Basic Local Alignment Search Tool; Altschul, S. F. et al. (1990) J. Mol. Biol. 215:403-410, see also www.ncbi.nlm.nih.gov/BLAST/). A database search using the Arabidopsis genes AT3g18660 and AT1g77130 against the maize database at NCBI identified accession nos. GB:BF729544 and GB:BG837930, which showed significant similarity to the Arabidopsis glycogenin genes. The sequences of the two ESTs are shown in SEQ ID NO: 4 and SEQ ID NO: 5 respectively. A blastx analysis of the two ESTs against the SPTREMBL database showed that the first hit for EST BF729544 was the AT3g18660 gene, whereas the first hit for EST BG837930 was the AT1g77130 gene. Protein alignments of these ESTs indicated that both ESTs were partial and that they showed 85-86% identity to the above two Arabidopsis genes. Moreover, for EST BF729544 the identity was confined to the central portion of the AT3g18660 protein, starting at amino acid position 245 and ending at position 427, whereas for EST BG837930 the identity started at amino acid position 391 and extended to position 632. A bestfit analysis between the nucleotide sequences of the two ESTs and the AT3g18660 gene showed that the two ESTs have 68-69% identity. A bestfit analysis between the two EST DNA sequences showed that there was a high degree of homology between the two ESTs. From the above analysis, it appears that EST BF729544 is the homolog of the Arabidopsis AT3g18660 gene, whereas EST BG837930 is a homolog of the Arabidopsis AT1g77130 gene.
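Percent identity figures such as those quoted above are derived from pairwise alignments. The following sketch shows one common convention, counting identical residues over the aligned columns in which neither sequence has a gap. It is not the BLAST or bestfit implementation, which may normalise differently, and the two short sequences are invented purely to exercise the function.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical positions over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Invented toy alignment, not taken from the sequences in the text.
print(percent_identity("MKV-LA", "MRVQLA"))
```

For the toy alignment, five columns are compared and four match.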

[0239] A database search using the Arabidopsis genes AT3g18660 and AT1g77130 against the in-house maize database identified four additional sequences which showed significant similarity to the Arabidopsis glycogenin genes. The four nucleotide sequences, called Maize SEQ 1, Maize SEQ 2, Maize SEQ 3 and Maize SEQ 4, are shown in SEQ ID NOs: 27, 29, 31 and 33, and the deduced amino acid sequences for these nucleotide sequences are shown in SEQ ID NOs: 28, 30, 32 and 34.

[0240] Culture Conditions

[0241] Maize was grown in the greenhouse with a 16 hour daylight and 8 hour night period, with a temperature of 24° C. during the day and 18° C. during the night. Seeds were harvested at different stages between 3 and 35 days after pollination (DAP). Young and medium leaves were also harvested.

[0242] Establishment of Copy Number and Identification of Glycogenin Homolog in Maize, Wheat and Arabidopsis

[0243] Genomic DNA was isolated from Arabidopsis, wheat and maize leaves according to the method of Davies et al. ((1994) Methods in Molecular Biology vol. 28: Protocols for nucleic acid analysis by non-radioactive probes, Isaac P. G. (ed.) pp 9-15, Humana Press, Totowa, N.J. USA). DNA was digested with the restriction enzymes EcoRI, XhoI and EcoRV and the digested DNA was run overnight at 20 V in 1% agarose gels. The DNA was then transferred to a nylon membrane by vacuum blotting. Two identical Southern blots were prepared and each one was probed first at high stringency and later at low stringency. One blot was probed with a digoxigenin labelled AT3g18660 cDNA probe encoding the N-terminus of the gene (a 1.8 kb NcoI-AvaI fragment) and the second blot was probed with an AT3g18660 cDNA probe (PGSIP) encoding the C-terminus of the gene (a 700 bp C1a K fragment), FIG. 1. Hybridisation was done at 65° C. and the blots were first washed for 2×5 minutes with 2×SSC, 0.1×SDS and later with 0.1×SSC and 0.1×SDS at 65° C. (high stringency washes). Strong single bands of the expected sizes (5.9 kb in the XhoI cut DNA, 4.6 kb in the EcoRI cut DNA and 50.1 kb in the EcoRV cut DNA) were observed only in the lanes containing Arabidopsis DNA. No band was observed in the lanes containing maize and wheat DNA, as shown in FIG. 6. Later the blots were stripped, re-probed at 55° C. and washed at 60° C. for 2×15 minutes with 2×SSC, 0.5% SDS (low stringency washes). Three bands were observed in the lane containing XhoI digested Arabidopsis DNA, and two to three bands were observed in the lanes containing maize and wheat DNA, as shown in FIG. 7. From the genomic sequence of the AT3g18660 gene it was known that it spanned two XhoI, EcoRI and EcoRV sites. This demonstrated that PGSIP exists as a gene family comprising about 2-3 genes in Arabidopsis, maize and wheat.

[0244] RNA Extraction and First Strand cDNA Synthesis

[0245] Total RNA was extracted from the tissues described above using the method of Napoli et al. (1990), Plant Cell, 2, 279-289, and in some cases using the Qiagen RNA extraction kit following the manufacturer's protocol. First strand cDNA was made using SuperscriptII reverse transcriptase (GIBCO-BRL) and the oligo dT primer described in Frohman et al. (1988), Proc. Natl. Acad. Sci. USA, 85:8998:

[0246] 5′ GACTCGAGTCGACATCGATTTTTTTTTTTTTTTTT 3′.

[0247] This cDNA pool was used to amplify a maize cDNA homolog to the Arabidopsis glycogenin gene (AT3g18660 and AT1g77130) utilising the sequence information from the ESTs GB:BF729544 and GB:BG837930 described above.

[0248] ESTs BF729544 and BG837930 overlapped and were combined to deduce a single maize PGSIP sequence. Primers were designed to amplify a maize cDNA clone corresponding to this sequence. The primer sequences were as follows:
[GlgmaF] 5′-GGCAATAGAGGAATTCATGTGC-3′
[GlgmaR] 5′-CGTGCAGAACTCGGACCACAG-3′
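Basic properties of the two maize primers can be estimated with a few lines of code. The sketch below computes length, GC fraction and an approximate melting temperature using the Wallace rule, Tm = 2(A+T) + 4(G+C). The Wallace rule is a rough rule of thumb intended for short oligonucleotides and is used here only as an illustration; the text does not say how the primers were evaluated, and the function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer sequences as listed in the text (5'->3').
MAIZE_PRIMERS = {
    "GlgmaF": "GGCAATAGAGGAATTCATGTGC",
    "GlgmaR": "CGTGCAGAACTCGGACCACAG",
}

for name, seq in MAIZE_PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```

Nearest-neighbour models generally give more reliable estimates for primers of this length, so the values printed here should be treated as indicative only.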

[0249] Construction of a Maize cDNA Library

[0250] Total RNA was extracted from the various tissues described above (leaves and seeds ranging from 3-35 DAP). The RNA obtained was mixed in equal amounts. This RNA mixture was then used to make a maize cDNA library using the SMART cDNA library construction kit (Clontech) following the manufacturer's instructions.

[0251] Cloning of Maize cDNA

[0252] 1 μl of the first strand cDNA obtained above was used to amplify the cDNA clone represented by the ESTs by PCR using the primers GlgmaF and GlgmaR. The PCR product obtained was cloned into EcoRV cut pBlueScript (SK−) and positive clones were identified. These positive clones were sequenced to confirm that the product obtained indeed represented the sequence of EST accession number BF729544. This product was then used to screen the cDNA library and a full length clone was obtained. A cDNA clone represented by EST accession no. BG837930 was cloned in the same way.

[0253] The PCR conditions were the same as described before for cloning the Arabidopsis gene (AT3g18660) of SEQ ID NO: 2.

Example 5 cDNA Isolation from Wheat Endosperm

[0254] A database search using the Arabidopsis genes AT3g18660 and AT1g77130 against the in-house wheat database identified one sequence which showed significant similarity to the Arabidopsis PGSIP genes (e-137). The sequence, called Wheat SEQ1, is shown in SEQ ID NO: 20.

[0255] Culture Conditions

[0256] Wheat variety NB1 (described in patent WO 00/63398) was grown in the glass house with a 16 hour daylight and 8 hour night period, with 22° C. during the day and 15° C. during the night. Seeds were harvested at different stages between 5 and 20 days after pollination (DAP). Young and medium leaves were also harvested.

[0257] RNA Extraction and First Strand cDNA Synthesis

[0258] Total RNA was extracted from the above tissues using the method of Napoli et al. (1990) and in some cases using the Qiagen RNA extraction kit following the manufacturer's protocol. First strand cDNA was made using SuperscriptII reverse transcriptase (GIBCO-BRL) and oligo dT primer as described in Frohman et al. (1988), Proc. Natl. Acad. Sci. USA, 85:8998. This cDNA pool was used to amplify a wheat cDNA homolog to the Arabidopsis glycogenin gene (AT3g18660 and AT1g77130) utilising the sequence information from the maize ESTs, NCBI accession nos. BF729544 and BG837930, described above.

[0259] Wheat cDNA Library Making

[0260] Total RNA was extracted from the various tissues described above (leaves and seeds ranging from 7-30 days post anthesis (DPA)). The RNA obtained was mixed in equal amounts. This RNA mixture was then used to make a wheat cDNA library using the SMART cDNA library construction kit (Clontech). Additionally, a genomic library from Triticum tauschii, var strangulata, accession number CPI 110799, described in Rahman et al., 1997, Genome, 40:465-474, was used in this study. The cDNA library from wheat cv Wyuna described in Li et al., 1999, Theor. Appl. Gen. 98:226-233, was also used.

[0261] Cloning of Wheat cDNA

[0262] Because a strong band was observed on Southern blots probed with the Arabidopsis gene (AT3g18660), it was assumed that there is a significant degree of homology between the Arabidopsis, maize and wheat DNA sequences. A comparison of the Arabidopsis and the maize EST sequences also suggested that this was the case. A wheat cDNA library was screened with probes made from the maize and the Arabidopsis glycogenin genes. A full length clone was obtained by restriction mapping and analysing the sequence of a number of positive clones.

[0263] PCR Conditions

[0264] The PCR conditions were the same as described before for cloning the Arabidopsis gene (AT3g18660).

Example 6 Agrobacterium Constructs

[0265] Construct Making

[0266] The pSB111 Sulugi described in patent publication WO 00/63398 was used. Six different constructs were made for constitutive expression: one each for maize, wheat and Arabidopsis in sense orientation and one each for maize, wheat and Arabidopsis in antisense orientation. Another set of six constructs was made using seed specific promoters.

[0267] Two constructs were made, one for overexpression and another for downregulation of the Atglycogenin gene. For overexpression, the Atglycogenin gene was excised from the plasmid (At3g18660 (PGSIP), FIG. 1) with a SalI-EcoRI digest and ligated into SalI-EcoRI cut pJIT65, resulting in plasmid pCL68. This plasmid was then digested with EcoRI-XhoI and the fragment was ligated into SalI-SmaI cut Nos-NptII SCV, resulting in plasmid pCL68 SCV. In this plasmid the Atglycogenin gene is under the 2×35S promoter for constitutive expression.

[0268] For the RNAi construct, a fragment representing the 3′ end of the Atglycogenin gene was first amplified by PCR using the ClaF and Glgstop2 primers (see Example 2) and cloned into pBluescript. The resulting construct was designated pMC167. Clones in both orientations were obtained and the clone with the fragment in reverse orientation was called pMC167inv. pMC167inv was cut with EcoRV-SmaI and religated, resulting in plasmid pMC167del. pMC167del was cut with HindIII-BamHI and ligated into HindIII-BamHI cut pT7blue2, resulting in plasmid “GlycoinpT7Blue2” (pCL66). Another plasmid (called “GlycogeninIRstep1”, pCL67) was created by cutting pMC167inv with XhoI-EcoRV and ligating this fragment into XhoI-EcoRV cut pWP446A containing the AtSac25 intron1. Finally, plasmid “GlycoinpT7Blue2” (pCL66) was cut with BamHI-SstI and the fragment ligated into BamHI-SstI cut “GlycogeninIRstep1” (pCL67), resulting in plasmid pCL69. pCL69 was cut with EcoRI-XhoI and the fragment was ligated into SCV Nos-NptII at the SmaI-SalI site, resulting in plasmid pCL76 SCV. In this plasmid the Atglycogenin (PGSIP) RNAi is under the 2×35S promoter for constitutive expression.
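The order of the cloning steps for the RNAi construct can be followed more easily when written as a dependency map. The sketch below records, for each plasmid, the plasmids or vectors from which it was built according to the paragraph above, and then lists everything upstream of a chosen plasmid. The data structure and function name are illustrative assumptions, and treating pMC167inv as derived from pMC167 is a simplification of the statement that it is the reverse-orientation clone.

```python
# Each plasmid maps to the plasmids/vectors it was built from, per the text.
PARENTS = {
    "pMC167inv": ["pMC167"],
    "pMC167del": ["pMC167inv"],
    "pCL66": ["pMC167del", "pT7blue2"],
    "pCL67": ["pMC167inv", "pWP446A"],
    "pCL69": ["pCL66", "pCL67"],
    "pCL76 SCV": ["pCL69", "SCV Nos-NptII"],
}


def ancestors(plasmid, parents=PARENTS):
    """Return every upstream plasmid or vector, nearest first, without repeats."""
    seen = []
    queue = list(parents.get(plasmid, []))
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.append(current)
        queue.extend(parents.get(current, []))
    return seen


print(ancestors("pCL76 SCV"))
```

Listing the ancestors of pCL76 SCV shows that both arms of the inverted repeat trace back to the same pMC167 fragment.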

[0269] FIG. 6 summarises the whole process and the maps of these plasmids are shown in FIGS. 9 and 10. The plasmids were transformed into the GV3101 Agrobacterium strain and the Arabidopsis plants were transformed.

Example 7 Transformation of Wheat

[0270] Wheat plants transformed with the constructs of Example 6 were produced by the seed inoculation method described in patent publication WO 00/63398. Solanum tuberosum c.v. Prairie was transformed with pCL68 SCV and pCL76 SCV using the method of leaf disk cocultivation essentially as described by Horsch et al. (Science 227: 1229-1231, 1985). The youngest two fully-expanded leaves from a 5-6 week old soil grown potato plant were excised and surface sterilized by immersing the leaves in 8% ‘Domestos’ for 10 minutes. The leaves were then rinsed four times in sterile distilled water. Discs were cut from along the lateral vein of the leaves using a No. 6 cork borer. The discs were placed in a suspension of Agrobacterium tumefaciens strain LBA4404 containing one of the two plasmids listed above for approximately 2 minutes. The leaf discs were removed from the suspension, blotted dry and placed on petri dishes (10 leaf discs/plate) containing callusing medium (Murashige and Skoog agar containing 2.5 μg/ml BAP, 1 μg/ml dimethylaminopurine, 3% (w/v) glucose). After 2 days the discs were transferred onto callusing medium containing 500 μg/ml Claforan and 50 μg/ml Kanamycin. After a further 7 days the discs were transferred (5 leaf discs/plate) to shoot regeneration medium consisting of Murashige and Skoog agar containing 2.5 μg/ml BAP, 10 μg/ml GA3, 500 μg/ml Claforan, 50 μg/ml Kanamycin and 3% (w/v) glucose. The discs were transferred to fresh shoot regeneration medium every 14 days until shoots appeared. The callus and shoots were excised and placed in liquid Murashige and Skoog medium containing 500 μg/ml Claforan and 3% (w/v) glucose. Rooted plants were weaned into soil and grown under greenhouse conditions to provide tuber material for analysis.

[0271] Alternatively, microtubers were produced by taking nodal pieces of tissue culture grown plants onto Murashige and Skoog agar containing 2.5 μg/ml Kanamycin and 6% (w/v) sucrose. These were placed in the dark at 19° C. for 4-6 weeks, when microtubers were produced in the leaf axils.

Example 8 Transformation of Maize

[0272] Maize plants transformed with the constructs of Example 6 were produced by the seed inoculation method described in patent publication WO 00/63398.

Example 9 Transformation of Potato

[0273] Transgenic potato plants expressing the Arabidopsis plant glycogenin-like gene in sense and antisense orientation were produced.

Example 10 Characterisation of the Transgenic Lines

[0274] Transgenic plants were analysed by the following methods. For sense constructs, 20 T1 lines were analysed; for antisense constructs, 50 T1 lines were analysed. Plants transformed with sense and antisense sequences of the invention were observed to have altered starch synthesizing ability which was linked to the expression of the transgene.

[0275] For the maize, wheat, and potato lines examined, several techniques of analysis were employed. PCR (identification of positive lines), northern analysis (RNA expression), Southern analysis (copy number detection), western analysis (protein expression), amylogenin activity, starch structure and quality, and phenotype all confirmed the successful transformation of the maize, wheat, and potato.

Example 11 cDNA Isolation from Rice

[0276] The six genes listed in Table 2 were blasted against the rice sequences collected in an in-house database and one new hit was obtained. The accession corresponded to SPTREMBL:Q94HG3, EMBL:AC079633 (SEQ ID NO: 25), which encodes a protein of 614 amino acids and shows strong homology to the PGSIP gene (e-129).

Example 12 Arabidopsis Transformation

[0277] Arabidopsis thaliana c.v. Columbia plants were transformed according to the method of Clough and Bent, Plant J. 16(6):735-743 (1998), with slight modification. Plants were grown to a stage at which bolts were just emerging. Phytagar 0.1% was added to the seeds and these were vernalized overnight at 4° C. We used 10-15 seeds per 3×5 inch pot. Seed was added onto the soil with a pipette, with about 4-5 seeds dispersed per ml. Seeds were germinated as usual (i.e. under humidity: pots were covered until the first leaves appeared and then, over a two day period, the lid was cracked and then removed). Plants were grown for about 4 weeks in the greenhouse (long day conditions) until bolts emerged. The first bolts were cut to encourage growth of multiple secondary bolts. Bolts containing many unopened flower buds were chosen for dipping.

[0278] Growing the Agrobacterium Culture

[0279] Aliquots of the Agrobacterium strain GV3101 carrying the constructs pCL68 SCV and pCL76 SCV were grown first as a 5 ml culture in YEP containing Gentamycin (15 μg/ml) and Kanamycin (20 μg/ml). The next day, 2 ml of freshly grown culture was added to 400 ml YEP medium (10 g Yeast Extract, 10 g peptone, 5 g NaCl, pH 7.0) in a 2 litre flask and the flask was incubated at 28° C. with shaking overnight. The next day the OD600 of the cells was measured and found to be 1.8. Cells were divided into 2× Oakridge bottles and harvested by centrifugation at 5000 rpm for 10 min in a GSA rotor at room temperature. The pellet was resuspended in 3 volumes of infiltration medium so that the final OD600 of the culture was 0.6. Infiltration medium was prepared by combining ½ Murashige and Skoog Salts, 1× Gamborg's Vitamins and 0.44 μM Benzylamino Purine (10 μl per L of a 1 mg/ml stock); the pH was adjusted to 5.7 with NaOH. Then 0.02% Silwet (200 μl per 1 L) was added and mixed into the solution.
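The relationship between the measured density, the resuspension volume and the final density follows from simple dilution arithmetic. The sketch below assumes that optical density scales linearly with cell density, that no cells are lost on centrifugation, and that the value of 0.6 refers to OD600; the function names are hypothetical.

```python
def resuspension_volume(culture_ml, od_measured, od_target):
    """Volume (ml) in which to resuspend the pellet to reach od_target."""
    if od_measured <= 0 or od_target <= 0:
        raise ValueError("optical densities must be positive")
    return culture_ml * od_measured / od_target


def additive_ul_per_litre(percent_v_v):
    """Microlitres of a liquid additive per litre for a given % (v/v)."""
    return percent_v_v / 100 * 1_000_000


volume = resuspension_volume(400, 1.8, 0.6)
print(round(volume), "ml, i.e.", round(volume / 400), "volumes")
print(round(additive_ul_per_litre(0.02)), "ul Silwet per litre")
```

A 400 ml culture at 1.8 needs to be taken up in three times its original volume to reach 0.6, which agrees with the "3 volumes" stated in the text, and 0.02% (v/v) corresponds to 200 μl per litre.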

[0280] Arabidopsis Transformation by Dipping

[0281] 500 ml of resuspended Agrobacterium was poured into a tray and plants were inverted into the Agrobacterium solution in batches of 10 for 15 minutes. After 15 minutes the plants were lifted and the excess solution drained. The plants were transferred on their sides to a fresh tray containing tissue paper to allow further soaking of the solution and then transferred to propagating trays. The plants were immediately covered with lids to maintain humidity. After two days the lid was removed and the plants were allowed to grow normally. They were not watered for one week, until the soil looked dry. After flowering was complete and the siliques on the plants were dry, all the seeds from one pot were harvested. The seeds were completely dried by keeping harvested seed in an envelope for one week.

Example 13 Selection of Transformed Arabidopsis thaliana Seed

[0282] Seed produced from transformed Arabidopsis thaliana c.v. Columbia plants was weighed into 10 mg aliquots, equivalent to about 500 individual seeds, and placed into a sterile 15 ml tube. The seed was surface sterilised by treating with 10 ml of Teepol bleach/Tween 20 solution (500 ml of 50% (v/v) Teepol bleach containing 1 drop of Tween 20) for five minutes. The seeds were then washed four times with 10 ml Tween 20 in sterile water (1 drop Tween 20 in 500 ml sterile water). The seeds were then suspended in 5 ml sterile water and 5 ml warm 0.5% agar and mixed carefully. Half of the seeds were spread over one petri dish containing half strength Murashige and Skoog agar medium and the other half over a second dish containing half strength Murashige and Skoog agar medium plus 50 μg/ml kanamycin. The plates were sealed and incubated at 4° C. for 48 hours. The plates were then transferred to a growth room under low light (2000 lux). Seed on both types of plate germinated, but on the plates containing kanamycin non-resistant plants bleached and died within 7 days. FIG. 8 demonstrates this selection of kanamycin resistant seedlings. After 14 days the resistant plants were transferred from the selective medium onto MS medium for a further 10 days before being transferred into soil. The plants were grown on to produce leaf material for further analysis.

Example 14 Analysis of Arabidopsis thaliana Plants Transformed with pCL68 SCV for the Presence of the PGSIP Construct

[0283] For the pCL68 SCV transformed lines, a total of 31 kanamycin resistant plants were obtained from four of the original floral dips. These were tested for the presence of the construct by PCR.

[0284] Genomic DNA Extraction

[0285] Leaf material was taken from regenerated Arabidopsis thaliana plants transformed with pCL68 SCV and genomic DNA was isolated. One leaf was excised from a plant growing in soil and placed in a 1.5 ml eppendorf tube. The tissue was homogenised using a micropestle, and 400 μl extraction buffer (200 mM Tris HCl pH 8.0; 250 mM NaCl; 25 mM EDTA; 0.5% SDS) was added and the tissue ground again carefully to ensure thorough mixing. Samples were vortex mixed for approximately 5 seconds and then centrifuged at 10,000 rpm for 5 minutes. A 350 μl aliquot of the resulting supernatant was placed in a fresh eppendorf tube and 350 μl chloroform was added. After mixing, the sample was allowed to stand for 5 minutes. This was then centrifuged at 10,000 rpm for 5 minutes. A 300 μl aliquot of the supernatant was removed into a fresh eppendorf tube. To this was added 300 μl of propan-2-ol, which was mixed in by inverting the eppendorf several times. The sample was allowed to stand for 10 minutes. The precipitated DNA was collected by centrifuging at 10,000 rpm for 10 minutes. The supernatant was discarded and the pellet air dried. The pellet of DNA was resuspended in 50 μl of distilled water and was used as a template in PCR.

[0286] PCR Detection of PGSIP

[0287] A pair of optimised oligonucleotide primers was designed and synthesised to enable the detection of the pCL68 SCV construct in transformed plants. The sequences of these primers were:
ATGLY002: CGTCTCGTGTCTGGTTTATATTCA
ATGLY003: TCGATGCCTGAGATCTCAGCT

[0288] PCR mixtures were set up containing 5 μl 10× Advantage Taq buffer; 5 μl 2 mM dNTPs; 0.5 μl of primer ATGLY002 (100 μM); 0.5 μl of primer ATGLY003 (100 μM); 5 μl DNA template (Arabidopsis thaliana genomic DNA or control pCL68 SCV plasmid DNA); 0.25 μl Advantage Taq polymerase; and 33.75 μl distilled water, in a final volume of 50 μl. The PCR was carried out on a thermocycler using the following parameters: first a hot start at 94° C. for 5 min, then 25 cycles consisting of 94° C. for 15 sec, 55° C. for 30 sec, and 72° C. for 3 min. The cycles were followed by 72° C. for 5 min and a final step of holding the samples at 8° C.
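The cycling parameters above determine the minimum run time of the program. The following sketch adds up the programmed hold times; it ignores the time the instrument spends ramping between temperatures, which depends on the thermocycler and is not given in the text. The function name is hypothetical.

```python
def program_seconds(initial_s, cycles, step_seconds, final_extension_s=0):
    """Total programmed hold time in seconds, excluding ramping."""
    return initial_s + cycles * sum(step_seconds) + final_extension_s


# 94 C for 5 min; 25 cycles of 15 s, 30 s and 3 min; 72 C for 5 min.
total_s = program_seconds(5 * 60, 25, [15, 30, 3 * 60], 5 * 60)
print(total_s, "s =", total_s / 60, "min")
```

The programmed holds come to a little under one and three quarter hours before the final 8° C. hold.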

[0289] A diagnostic DNA fragment of 977 bp was produced in these reactions. Results of the PCR are shown in FIG. 11.

[0290] The PCR results for pCL68 SCV transformed plants indicated that 30 of the 31 plants examined had been successfully transformed. Thus, all of the plants except for the plant labeled 1-005 contained the PGSIP gene.

Example 15 Analysis of Arabidopsis thaliana Plants Transformed with pCL76 SCV for the Presence of the PGSIP Downregulation Construct

[0291] For the pCL76 SCV transformed lines, a total of 10 kanamycin resistant plants were obtained. Leaf material was taken from regenerated Arabidopsis thaliana plants transformed with pCL76 and genomic DNA was isolated. One leaf was excised from a plant growing in soil and placed in a 1.5 ml eppendorf tube. The tissue was homogenised using a micropestle, and 400 μl extraction buffer (200 mM Tris HCl pH 8.0; 250 mM NaCl; 25 mM EDTA; 0.5% SDS) was added and the tissue ground again carefully to ensure thorough mixing. Samples were vortex mixed for approximately 5 seconds and then centrifuged at 10,000 rpm for 5 minutes. A 350 μl aliquot of the resulting supernatant was placed in a fresh eppendorf tube and 350 μl chloroform was added. After mixing, the sample was allowed to stand for 5 minutes. This was then centrifuged at 10,000 rpm for 5 minutes. A 300 μl aliquot of the supernatant was removed into a fresh eppendorf tube. To this was added 300 μl of propan-2-ol, which was mixed in by inverting the eppendorf several times. The sample was allowed to stand for 10 minutes. The precipitated DNA was collected by centrifuging at 10,000 rpm for 10 minutes. The supernatant was discarded and the pellet air dried. The pellet of DNA was resuspended in 50 μl of distilled water and was used as a template in PCR.

[0292] PCR Detection of PGSIP RNAi DNA

[0293] A pair of optimised oligonucleotide primers was designed and synthesised to enable the detection of the pCL76 SCV construct in transformed plants. The sequences of these primers were:
ATGLY001: TTTGAACAAACAAAAAGGTGGAAC
ATGLY002: CGTCTCGTGTCTGGTTTATATTCA

[0294] PCR mixtures were set up containing 5 μl 10× Advantage Taq buffer; 5 μl 2 mM dNTPs; 0.5 μl of primer ATGLY001 (100 μM); 0.5 μl of primer ATGLY002 (100 μM); 5 μl DNA template (Arabidopsis thaliana genomic DNA or control pCL76 SCV plasmid DNA); 0.25 μl Advantage Taq polymerase; and 33.75 μl distilled water, in a final volume of 50 μl. The PCR was carried out on a thermocycler using the following parameters: first a hot start at 94° C. for 5 min, then 25 cycles of 94° C. for 15 sec, 55° C. for 30 sec, and 72° C. for 3 min. The cycles were followed by 72° C. for 5 min and the samples were then held at 8° C.

[0295] A diagnostic DNA fragment of 819 bp was produced in these reactions. See FIG. 13. Out of 8 kanamycin resistant plants tested, 2 were shown to contain the PGSIP RNAi gene construct.

Example 16 Constitutive Overexpression and Downregulation of PGSIP Gene in Barley

[0296] Starch is made in the leaves and the grain. To test the effect of overexpressing and downregulating the PGSIP gene in a monocot species, plasmids pCL68 SCV (sense construct) and pCL76 SCV (RNAi construct) were expressed in barley. These plasmids conferred constitutive expression as the genes were under the control of the double 35S promoter. Additionally, the full length gene and the RNAi cassette were expressed under the control of the rice actin promoter (U.S. Pat. No. 5,614,1876). For this purpose, the Gateway cloning technology (Invitrogen) was used according to the manufacturer's instructions with slight modification. The full length PGSIP was excised from plasmid pMC168 with NcoI-EcoRI and cloned into pENTR4 vector cut with NcoI-EcoRI, resulting in a plasmid called pMC175. The RNAi cassette was excised from plasmid pCL76 SCV with SalI-EcoICRI and cloned into pENTR1 vector cut with SalI-EcoRV, resulting in plasmid pMC174. These plasmids were then recombined with the Destination vector pWP492R12 SCV, which contained the actin promoter flanked by two recombination sites (attR1 and attR2 on either side (Invitrogen)). This resulted in plasmids pMC177 and pMC176 respectively, which contained the PGSIP gene and the RNAi construct under the control of the rice actin promoter (U.S. Pat. No. 5,614,1876). These plasmids are shown in FIGS. 14 and 15.

[0297] The constructs were transformed into Agrobacterium strain AGL-1 (Lazo et al., 1991, Bio/Technol 9: 963-967) for barley transformation. Immature embryos of the barley variety Golden Promise were transformed essentially according to the method of Tingay et al. (The Plant Journal 11(6): 1369-1376, 1997). Donor plants of Golden Promise were grown with an 18 hour day at 18/13° C. Immature embryos (1.5-2.0 mm) were isolated and the axes removed. They were then dipped into an overnight liquid culture of Agrobacterium, blotted and transferred to co-cultivation medium. After 2 days the embryos were transferred to MS based callus induction medium with Asulam and Timentin for 10 days. Tissues were transferred at 2 weekly intervals, and at each transfer they were cut into small pieces and lined out on the plate. At the third transfer, only the embryogenic tissue was moved on to fresh medium. After a total of 8 weeks in culture, the tissue was transferred to regeneration medium (FHG), where plantlets formed within 2-4 weeks. These were transferred to Beatsons glass jars with growth regulator free medium until roots had formed, when they were transferred to Jiffies expandable peat pellets and then to the Conviron growth chamber.

[0298] The plants were analysed by PCR using the following primers.

[0299] For plants containing pCL68 plasmid (sense expression):
5′-ATTTGGAGAGGACAGCCCAAGC-3′ Glyc For
5′-CTCCATCGTTGGATCTCGTTCG-3′ Glyc Rev (S)

[0300] For plants containing pCL76 plasmid (RNAi expression):
5′-ATTTGGAGAGGACAGCCCAAGC-3′ Glyc For
5′-GCGTCATCTTCATCGCCAATCC-3′ Glyc Rev (D)

[0301] PCR was carried out as described above.
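The screening primers listed above can be checked computationally before use. The following is a minimal sketch, not part of the original protocol: it reports length, GC fraction and a rule-of-thumb melting temperature for each primer, and locates the annealing site of the sense-construct reverse primer in a short excerpt of SEQ ID NO: 1 copied from the sequence listing. The function names are hypothetical, and the 2/4 (Wallace) rule is only a crude estimate for oligonucleotides of this length.

```python
# Primer sequences as printed in paragraphs [0299] and [0300].
PRIMERS = {
    "Glyc For": "ATTTGGAGAGGACAGCCCAAGC",
    "Glyc Rev (S)": "CTCCATCGTTGGATCTCGTTCG",
    "Glyc Rev (D)": "GCGTCATCTTCATCGCCAATCC",
}

# Short excerpt of SEQ ID NO: 1 (near position 1150), spaces removed.
SEQ1_EXCERPT = "attttgattcgaacgagatccaacgatggagagaagtatc"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb Tm in deg C: 2 per A/T plus 4 per G/C (rough estimate only)."""
    s = seq.upper()
    return 2 * (s.count("A") + s.count("T")) + 4 * (s.count("G") + s.count("C"))


def reverse_primer_site(primer: str, template: str) -> int:
    """0-based index on the given strand where a reverse primer anneals, or -1."""
    return template.upper().find(reverse_complement(primer).upper())


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, Tm ~{wallace_tm(seq)} C")
    print("Glyc Rev (S) site in excerpt:",
          reverse_primer_site(PRIMERS["Glyc Rev (S)"], SEQ1_EXCERPT))
```

A result of -1 from `reverse_primer_site` would only mean that the primer does not match the excerpt supplied; it says nothing about the full-length sequence.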

[0302] Results:

[0303] Six barley plants were regenerated after transformation with plasmid pCL68 SCV and eight plants with plasmid pCL76 SCV. The plants were first analysed by PCR and the leaves of the positive plants were subjected to iodine staining with Lugol's solution. The results of PCR analysis are presented in Table 7.

TABLE 7 Results of PCR screen of barley plants transformed with pCL68 SCV or pCL76 SCV.

Construct   Plant no.   PCR no.   PCR
Control1    -           GG11      Neg
Control2    -           GG12      Neg
Control3    -           GG13      Neg
pCL68       1           GG1       Pos
pCL68       2           GG2       Neg
pCL68       3           -         -
pCL68       4.1         GG8       Neg
pCL68       5.1         -         -
pCL68       6.1         GG3       Neg
pCL68       6.2         -         -
pCL68       6.3         GG9       Neg
pCL68       7.1         GG10      Neg
pCL76       1.1         GG4       Pos
pCL76       1.2         GG5       Pos
pCL76       1.3         GG6       Pos
pCL76       1.4         GG14      Pos
pCL76       1.5         GG15      Neg
pCL76       2           GG7       Neg
pCL76       3.1         GG16      Pos
pCL76       4.1         GG17      Neg

[0304] One plant containing the sense construct was found to contain more starch granules in its leaves relative to control plants without the sense construct. The plants containing the RNAi construct were found to lack starch granules as shown in FIG. 17.

Example 17 Seed Specific Overexpression and Downregulation of the PGSIP Gene in Barley

[0305] For seed specific expression, the plasmids pMC174 and pMC175 were recombined with the plasmid pWP491R12SCV that contained the seed specific promoter flanked by two recombination sites (attR1 and attR2, one on either side (Invitrogen)). Barley plants were transformed according to the method of Tingay et al. (1997) with some modification as described for Example 13.

Example 18 Analysis of Transformed Solanum tuberosum Plants for Presence of the PGSIP Construct

[0306] Analysis of Regenerated Potato transformants.

[0307] Leaf material was taken from regenerated potato plants and genomic DNA isolated.

[0308] One large potato leaf (approximately 30 mg) was excised from an in vitro grown plant and placed in a 1.5 ml eppendorf tube. The tissue was homogenised using a micropestle, 400 μl extraction buffer (200 mM Tris-HCl pH 8.0; 250 mM NaCl; 25 mM EDTA; 0.5% SDS) was added, and the tissue was ground again carefully to ensure thorough mixing. Samples were vortex mixed for approximately 5 seconds and then centrifuged at 10,000 rpm for 5 minutes. A 350 μl aliquot of the resulting supernatant was placed in a fresh eppendorf tube and 350 μl chloroform was added. After mixing, the sample was allowed to stand for 5 minutes. This was then centrifuged at 10,000 rpm for 5 minutes. A 300 μl aliquot of the supernatant was removed into a fresh eppendorf tube. To this was added 300 μl of propan-2-ol, and the contents were mixed by inverting the eppendorf several times. The sample was allowed to stand for 10 minutes. The precipitated DNA was collected by centrifuging at 10,000 rpm for 10 minutes. The supernatant was discarded and the pellet air dried. The pellet of DNA was resuspended in 50 μl of distilled water and was used as a template in PCR.

[0309] PCR mixtures were set up which contained 5 μl 10× Advantage Taq buffer; 5 μl 2 mM dNTPs; 0.5 μl of either primer ATGLY001 or ATGLY003 (100 μM); 0.5 μl of primer ATGLY002 (100 μM); 5 μl DNA template (Solanum tuberosum c.v. Prairie genomic DNA, control pCL68 SCV plasmid DNA or control pCL76 SCV plasmid DNA); 0.25 μl Advantage Taq polymerase; and 33.75 μl distilled water, in a final volume of 50 μl. The PCR was carried out on a thermocycler using the following parameters: first a hot start at 94° C. for 5 min, followed by 25 cycles of 94° C. for 15 sec, 55° C. for 30 sec, and 72° C. for 3 min. The cycles were followed by 72° C. for 5 min, finally holding the temperature at 8° C.
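The reaction composition and cycling programme above lend themselves to a quick arithmetic check. The sketch below, offered only as an illustration, sums the component volumes, derives the final dNTP and primer concentrations, scales the recipe into a master mix and estimates the programmed cycling time. The helper names and the 10% pipetting overage are assumptions and do not appear in the original protocol; ramp times between temperatures are ignored.

```python
# Per-reaction volumes in microlitres, taken from paragraph [0309].
REACTION_UL = {
    "10x Advantage Taq buffer": 5.0,
    "2 mM dNTPs": 5.0,
    "primer ATGLY001 or ATGLY003 (100 uM)": 0.5,
    "primer ATGLY002 (100 uM)": 0.5,
    "DNA template": 5.0,
    "Advantage Taq polymerase": 0.25,
    "distilled water": 33.75,
}


def total_volume(components):
    """Sum of all component volumes (ul)."""
    return sum(components.values())


def final_concentration(stock, volume_ul, total_ul):
    """Final concentration after dilution, in the same unit as the stock."""
    return stock * volume_ul / total_ul


def master_mix(components, n_reactions, overage=0.10, exclude=("DNA template",)):
    """Scale the recipe for n reactions; the overage fraction is a hypothetical allowance."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2)
            for name, vol in components.items() if name not in exclude}


def programmed_time_min(cycles=25, hot_start_s=300, steps_s=(15, 30, 180), final_s=300):
    """Programmed hold time in minutes, excluding ramping between temperatures."""
    return (hot_start_s + cycles * sum(steps_s) + final_s) / 60.0


if __name__ == "__main__":
    total = total_volume(REACTION_UL)
    print("total volume (ul):", total)
    print("final dNTP (mM):", final_concentration(2.0, 5.0, total))
    print("final primer (uM):", final_concentration(100.0, 0.5, total))
    print("master mix for 10 reactions:", master_mix(REACTION_UL, 10))
    print("programmed time (min):", programmed_time_min())
```

Excluding the template from the master mix reflects the common practice of adding template DNA to each tube individually; this is a design choice of the sketch rather than a statement of how the reactions were assembled.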

[0310] A diagnostic DNA fragment of 977 bp was produced in these reactions from plasmid pCL68 SCV or 819 bp from plasmid pCL76 SCV. Lines of Solanum tuberosum c.v. Prairie transformed with pCL68 SCV or pCL76 SCV were tested by PCR and were shown to contain the construct.

[0311] FIG. 16 shows the results for pCL68 SCV and FIG. 17 shows the results for pCL76 SCV. Of 18 plants transformed with pCL68 SCV, all 18 contained the sense PGSIP construct. For the PGSIP RNAi construct (pCL76 SCV), 3 out of 8 plants contained the construct.

Example 19 Analysis of Transformed Plants for PGSIP Expression

[0312] Raising Antisera to PGSIP Proteins.

[0313] Expression of PGSIP proteins can be analysed by Western blotting. Antibodies to PGSIP are raised by inoculating rabbits with peptides corresponding to the Arabidopsis thaliana PGSIP protein sequences produced by expressing the sequence as a transcriptional fusion with glutathione-S-transferase in E. coli cells.

[0314] Preparation of Protein Extracts.

[0315] Protein extracts from potato tuber were produced by taking up to 100 mg of tissue and homogenising in 1 ml of ice cold extraction buffer consisting of 50 mM HEPES pH 7.5, 10 mM EDTA, 10 mM DTT. Additionally, protease inhibitors, such as PMSF or pepstatin, were included to limit the rate of protein degradation. The extract was centrifuged at 13000 rpm for 1 minute and the supernatant decanted into a fresh eppendorf tube and stored on ice. The supernatant was assayed for soluble protein content using, for example, the BioRad dye-binding protein assay (Bradford, M. M. (1976) Anal. Biochem. 72, 248-254).

[0316] An aliquot of the soluble protein sample, containing 10-50 μg total protein, was placed in an eppendorf tube and excess acetone (ca 1.5 ml) added to precipitate the proteins, which were collected by centrifuging the sample at 13000 rpm for 5 minutes. The acetone was decanted and the samples air-dried until all the residual acetone had evaporated.

[0317] SDS-Polyacrylamide Gel Electrophoresis.

[0318] The protein samples were separated by SDS-PAGE. SDS-PAGE loading buffer (2% (w/v) SDS; 12% (w/v) glycerol; 50 mM Tris-HCl pH 8.5; 5 mM DTT; 0.01% Serva blue G250) was added to the protein samples (up to 50 μl). Samples were heated at 70° C. for 10 minutes before loading onto a NuPage polyacrylamide gel. The electrophoresis conditions were 200 V constant for 1 hour on a 10% Bis-Tris precast polyacrylamide gel, using 50 mM MOPS, 50 mM Tris, 1 mM EDTA, 3.5 mM SDS, pH 7.7 running buffer, according to the NuPage methods (Invitrogen, U.S. Pat. No. 5,578,180).

[0319] Electroblotting.

[0320] Separated proteins were transferred from the acrylamide gel onto PVDF membrane by electroblotting (Transfer buffer: 20% methanol; 25 mM Bicine pH 7.2; 25 mM Bis-Tris, 1 mM EDTA, 50 μM chlorobutanol) in a Novex blotting apparatus at 30 V for 1.5 hours.

[0321] Immunodetection.

[0322] After blocking the membrane with 5% milk powder in Tris buffered saline (TBS-Tween) (20 mM Tris, pH 7.6; 140 mM NaCl; 0.1% (v/v) Tween-20), the membrane was challenged with a rabbit anti-PGSIP antiserum at a suitable dilution in TBS-Tween. Specific cross-reacting proteins were detected using an anti-rabbit IgG-horseradish peroxidase conjugate secondary antibody and visualised using the enhanced chemiluminescence (ECL) reaction (Amersham Pharmacia).

[0323] Detection of mRNA.

[0324] Expression of PGSIP mRNA was analysed in plants by RT-PCR or by Northern blotting.

Example 20 Analysis of Leaf Starch Content

[0325] Samples of leaves (about 60 mg) from control and transformed Arabidopsis thaliana plants which had been grown for 24 hours under high light were taken in a microfuge tube and extracted with 100 μl of 45% HClO₄. This suspension was diluted with 1 ml of distilled water and centrifuged (14000 rpm, 2 min). Aliquots of the extracts were then analysed for starch content by taking 100 μl of the extract and mixing with an equal volume of Lugol's solution, the optical density of which was then measured at 540 nm using a microplate reader. Standard starch mixtures were prepared in the same way and measured at the same time, and the starch content of the extracts was calculated by reference to these standards.

TABLE 8 Starch contents of leaves of Arabidopsis thaliana plants transformed with pCL68 SCV (sense construct comprising SEQ ID NO: 1) compared with the starch contents of leaves of non transformed (ncc) control plants. Control value is the mean ± (the standard error of the mean) for three plants.

Sample   Leaf starch content (μg/g fresh weight (FWt))
1-001    19.95
1-002    12.68
1-003    49.68
1-004    48.02
1-005    13.88
1-006    17.47
1-007    49.55
1-008    24.88
1-009    8.65
1-010    17.71
1-011    15.93
1-012    9.95
1-013    6.02
2-001    21.9
2-002    18.20
2-003    11.82
6-001    22.85
6-005    9.51
6-006    13.21
6-007    33.60
6-008    17.96
6-009    8.88
6-010    18.58
6-011    11.98
9-002    32.83
9-003    38.43
9-004    16.16
ncc      22.59 (±5.08)
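The calculation of starch content by reference to standards, as described in the method above, can be sketched as follows. This is an illustrative outline only: it assumes a linear relationship between starch amount and optical density at 540 nm, the standard values are hypothetical placeholders rather than measured data, and the scaling to fresh weight assumes a total extract volume of 1.1 ml (100 μl acid plus 1 ml water) from 60 mg of leaf.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def starch_in_aliquot(od540, slope, intercept):
    """Invert the standard curve: starch (ug) in the assayed aliquot."""
    return (od540 - intercept) / slope


def per_gram_fresh_weight(ug_in_aliquot, aliquot_ul=100.0,
                          extract_ul=1100.0, fresh_weight_mg=60.0):
    """Scale the aliquot value to the whole extract and express per g fresh weight.

    The default volumes and leaf mass are assumptions based on paragraph [0325].
    """
    total_ug = ug_in_aliquot * extract_ul / aliquot_ul
    return total_ug / (fresh_weight_mg / 1000.0)


# Hypothetical standards: starch per well (ug) and the OD540 read for each.
STANDARD_UG = [0.0, 10.0, 20.0, 40.0]
STANDARD_OD = [0.05, 0.15, 0.25, 0.45]

if __name__ == "__main__":
    slope, intercept = fit_line(STANDARD_UG, STANDARD_OD)
    amount = starch_in_aliquot(0.35, slope, intercept)
    print(f"slope={slope:.4f} intercept={intercept:.4f} aliquot starch={amount:.2f} ug")
```

In practice the linear range of the iodine response would need to be confirmed with the actual standards before unknowns are interpolated.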

[0326] The ncc value represents the mean and standard error for the three control plants. Each data point otherwise represents a single leaf from an individual plant. Taking the error of the control as a measure of the population variation, plants 1-003, 1-004, 1-007, 1-008, 6-007 and 9-003 have significantly more starch in their leaves than the controls. Plants 1-009, 1-012, 1-013, 2-003, 6-005, 6-009 and 6-011 have significantly lower starch contents. The copy number and level of expression of the sense construct in the plants are to be determined. The results demonstrate that a sense construct comprising SEQ ID NO: 1 can effectively alter the content of starch.

TABLE 9 Starch contents of leaves of Arabidopsis thaliana plants transformed with pCL76 SCV (RNAi construct) compared to controls.

Samples          Starch content (μg per leaf)
pCL76 SCV 7      27.20
pCL76 SCV 20.1   26.96
Control ncc      42.97

[0327] The data in these tables show that the leaves of the transformed plants have an altered starch content compared to the untransformed controls (ncc).
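The comparison of individual plants against the control mean and its standard error can be expressed as a simple banding rule. The sketch below is one possible reading, not the applicants' stated procedure: the specification does not give the threshold used, so the multiplier `k` (set here to 2 standard errors) is an assumption, as are the function and label names. The control mean and standard error are those given in Table 8, and the three example plants are entries from that table.

```python
CONTROL_MEAN = 22.59  # ncc mean from Table 8 (ug/g fresh weight)
CONTROL_SE = 5.08     # standard error of the ncc mean from Table 8


def classify(value, mean=CONTROL_MEAN, se=CONTROL_SE, k=2.0):
    """Place a measurement relative to the band mean +/- k * se.

    k is a hypothetical choice; the source does not state the criterion.
    """
    if value > mean + k * se:
        return "higher"
    if value < mean - k * se:
        return "lower"
    return "within"


# Three entries from Table 8, used purely as examples.
EXAMPLES = {"1-002": 12.68, "1-003": 49.68, "1-004": 48.02}

if __name__ == "__main__":
    for plant, starch in EXAMPLES.items():
        print(plant, starch, classify(starch))
```

Because each transformed plant is represented by a single leaf, a band of this kind is descriptive only; a formal significance test would need replicate measurements per plant.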

Example 21 Microscopic Analysis of Starch Granule Size and Number

[0328] Starch granules were extracted from Arabidopsis thaliana or Solanum tuberosum tissue by taking 50-100 mg of tissue and homogenising in 1% sodium metabisulphite solution. After filtering the extract through miracloth, the starch was collected by centrifugation at 1300 rpm for 5 minutes and then resuspended in 1 ml of water. Aliquots were taken and an equal amount of Lugol solution added to enhance the contrast of the starch granules. Suspensions were prepared for microscope imaging by placing onto a microscope slide. Representative micrographs were taken of the samples. The electronically captured images were then processed using suitable image analysis software, such as the package ImageJ. This enabled the size distributions of different starch samples to be quantified and compared.
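Once particle areas have been exported from the image analysis software, a size distribution of the kind described above can be tabulated in a few lines. The following is a hedged sketch: the conversion of projected area to an equivalent circular diameter, the 2 μm bin width and the example areas are all assumptions made for illustration, and none of these values come from the specification.

```python
import math
from collections import Counter
from statistics import mean, median


def equivalent_diameter(area_um2):
    """Diameter (um) of a circle with the same projected area (um^2)."""
    return 2.0 * math.sqrt(area_um2 / math.pi)


def size_histogram(diameters_um, bin_width_um=2.0):
    """Count granules per bin; each key is the lower edge of its bin (um)."""
    counts = Counter(math.floor(d / bin_width_um) * bin_width_um
                     for d in diameters_um)
    return dict(sorted(counts.items()))


# Hypothetical particle areas (um^2) standing in for exported measurements.
EXAMPLE_AREAS = [3.1, 12.6, 28.3, 50.3, 78.5, 19.6]

if __name__ == "__main__":
    diameters = [equivalent_diameter(a) for a in EXAMPLE_AREAS]
    print("mean diameter (um):", round(mean(diameters), 2))
    print("median diameter (um):", round(median(diameters), 2))
    print("histogram:", size_histogram(diameters))
```

Comparing histograms from control and transformed plants built with the same bin width gives a direct view of any shift in granule size.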

[0329] Alternatively, samples of purified starch are either suspended in water and viewed with a light microscope or sputter-coated with gold and viewed with a scanning electron microscope such as a Philips (Eindhoven, The Netherlands) XL30 Field Emission Gun scanning electron microscope at 3 kV.

[0330] Starch granules can be examined in tissues as well. For example, starch in tissues is stained using Lugol's solution (1% Lugol's solution, I-KI [1:2, v/v]; Merck). Starch can then be examined, for example, in longitudinal sections of tubers. Alternatively, the starch can be further isolated subsequent to staining, suspended in water, stained again with a few drops of Lugol's solution and examined microscopically.

[0331] The radii of the blue staining core of the starch granules and the total granule are measured microscopically using an ocular micrometer. If granules are ovoid in shape, both long radius and short radius measurements are taken. The radii of the blue-staining core and the total granule are determined by measuring individual, randomly chosen starch granules.

Example 22 Analysis of Starch Functionality

[0332] Preparation of Starch.

[0333] Starch was extracted from potato tubers by taking 0.5-1 kg of washed tuber tissue and homogenising using a juicerator chased with 200 ml of 1% sodium bisulphite solution.

[0334] The starch was allowed to settle, the supernatant decanted off and the starch washed by resuspending in 200 ml of ice-cold water. The resulting starch pellet was left to air dry. Once dried, the starch was stored at −20° C.

[0335] Alternatively, other methods can be utilized to isolate starch. For example, samples of tubers are first homogenized in extraction buffer (10 mM EDTA, 50 mM Tris, pH 7.5, 1 mM DTT, 0.1% Na2S2O5). The resulting fibrous substance is then washed several times with the extraction buffer and filtered. The filtrate is allowed to settle at 4° C. and the supernatant is discarded after the starch granules have settled. Starch granules are then washed with extraction buffer, water, and acetone and dried at 4° C.

[0336] With maize and other cereal crops, seeds are soaked in 50 ml of a 20 mM sodium acetate, pH 6.5, 10 mM mercuric chloride solution. After 24 hr, the germ and pericarp are removed and 50 ml of fresh solution is added for an additional 24 hr. Endosperm is repeatedly homogenized for 1 minute intervals in a mortar and pestle, and freed starch granules are purified by multiple extractions with saline and toluene (Boyer et al., 1976, Cereal Chemistry 53: 327-337). Granular starch is washed three times with double distilled water, once with acetone, and dried at 40° C.

[0337] Viscometric Analysis of Starch.

[0338] Starch samples were analysed for functionality by testing rheological properties using viscometric analysis (rapid visco analyzer (RVA) or differential scanning calorimetry (DSC)). Viscosity of starches can also be measured by various other techniques. For example, a Rapid Visco Analyser Series 4 instrument (Newport Scientific, Sydney, Australia) can be utilized with a 13 min profile where 2 g of starch are analyzed in water at a concentration of 7.4% (w/v), using the stirring and heating protocol suggested by Newport Scientific. For longer profiles, 2.5 g starch samples are used at a concentration of 10% (w/v). The sample is heated while stirring at 1.5° C. min⁻¹ from 50° C. to 95° C. for 15 min then cooled to 50° C. at 15 min⁻¹. Viscosity is measured in centipoise (cP).

Example 23 Analysis of Fine Structure of Starch

[0339] Amylopectin Chain Length Distribution

[0340] One method for examining the fine structure of starch is ¹⁴C labeling of amylopectin chains to determine chain lengths. Extracted starch granules are suspended at 25 mg ml⁻¹ in medium comprising 100 mM Bicine (pH 8.5), 25 mM potassium acetate, 10 mM DTT, 5 mM EDTA, 1 mM ADP[U-¹⁴C] glucose at 18.5 GBq mol⁻¹ and 10 μl starch suspension in a total volume of 100 μl, for each sample. Samples are then incubated for 1 hour at 25° C. The incubation is terminated by addition of 3 ml of 750 ml l⁻¹ aqueous methanol containing 10 g l⁻¹ KCl (methanol/KCl). After incubation for at least 5 minutes at room temperature, starch is collected by centrifugation at 2000 g for 5 min. The supernatant is discarded and the pellet is resuspended in 0.3 ml distilled water. The methanol/KCl wash, centrifugation, and resuspension are repeated 2-4 times. The resulting pellets are dried at room temperature, dissolved with 50 μl 1 M NaOH, and diluted with 50 μl distilled water. To determine the average length of amylopectin chains into which ¹⁴C was incorporated, products of incubation with ADP[U-¹⁴C] glucose are debranched with isoamylase and subjected to chromatography on a column of Sepharose CL-4B. The glucan eluting earlier from the column consists of longer chains than glucan eluting later from the column.

[0341] Another method for examining the fine structure of starch is chromatography without labeling. A 10 mg sample of isolated starch is dissolved in 100 μl 0.1 M NaOH for 1 hour at 95° C. The sample is diluted with 900 μl water and 150 μl 1 M sodium citrate (pH 5.0). The starch is then debranched by adding 300 units of isoamylase, or hydrolysed with 300 units of alpha-amylase or beta-amylase, for 24 hours at 37° C. A 100 μl aliquot of the hydrolysed sample is analyzed by chromatography. For example, HPAE-PAD chromatography can be used (Carbo PAC PA-100 column; Dionex, Idstein, Germany; flow 1 ml min⁻¹; buffer A: 150 mM NaOH; buffer B: 1 M sodium acetate in buffer A) with an applied gradient comprising 0-5 min 100% A; 5-20 min 85% A, 15% B; 20-35 min 70% A, 30% B (linear); 35-80 min 50% A, 50% B (convex).

[0342] Alternatively, HPLC chromatography is utilized, where partially hydrolyzed, debranched starch samples are dissolved in 0.01 N NaOH (5 mg/ml) and 2 ml are applied to a size exclusion column (Sephadex G-75, 1.5×100 cm). The mobile phase is 0.01 N NaOH and the flow rate is 0.6-0.9 ml/min. Samples are analyzed for total carbohydrate by the phenol-sulfuric acid test (Hodge and Hofreiter, 1962, Vol. 1, R. L. Whistler and M. L. Wolfrom (Eds.), Corporation. Version 7. Academic Press, New York, pp: 388-389) and by the Park Johnson test for reducing ends (Porro et al., 1981, Anal Biochem. 118(2):301-6). Based on these two analyses, the average chain length for each fraction is calculated.
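The average chain length of a fraction follows from the two assays above as the ratio of total carbohydrate to reducing ends, since each linear chain released by debranching carries one reducing end. The sketch below illustrates that arithmetic under the assumption that both assays are expressed in the same molar unit of glucose equivalents; the function name and the example fraction values are hypothetical and are not data from the specification.

```python
def average_chain_length(total_carbohydrate_nmol, reducing_ends_nmol):
    """Average degree of polymerisation: glucose units per reducing end.

    Both inputs must be in the same unit (here nmol glucose equivalents).
    """
    if reducing_ends_nmol <= 0:
        raise ValueError("reducing-end value must be positive")
    return total_carbohydrate_nmol / reducing_ends_nmol


# Hypothetical column fractions: (fraction number, total carbohydrate, reducing ends).
EXAMPLE_FRACTIONS = [
    (20, 450.0, 9.0),
    (30, 600.0, 25.0),
    (40, 240.0, 20.0),
]

if __name__ == "__main__":
    for number, total, ends in EXAMPLE_FRACTIONS:
        print(f"fraction {number}: average chain length "
              f"{average_chain_length(total, ends):.1f}")
```

Earlier-eluting fractions from a size exclusion column would be expected to give larger values than later ones, which offers a simple sanity check on the assay results.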

[0343] Amylopectin is further characterized by measuring the low molecular weight to high molecular weight chain ratio (on a weight basis) according to the method of Hizukuri (Hizukuri, 1986, Carbohydrate Research, 147, 342-347).

[0344] An alternative method for analyzing amylopectin chains is gel electrophoresis. Starch samples are debranched with isoamylase, derivatised with the fluorophore APTS, and subjected to gel electrophoresis in an Applied Biosystems DNA sequencer. Data are analyzed by Genescan software. The method allows for identification of authentic maltohexaose and maltoheptaose as well as a determination of percent molar differences and the degree of polymerization, distribution of chain lengths, between samples.

[0345] Amylose Content of Starch

[0346] Amylose percentages are determined by gel permeation chromatography according to Denyer et al. (Denyer et al., 1995, Plant Cell Environ 18:1019-1026) or by gel filtration analysis according to Boyer and Liu (Boyer and Liu, 1985, Starch Starke 37:73-79).

[0347] Alternatively, the amylose contents are determined spectrophotometrically in 1 to 2 mg isolated starch according to the iodometric method described by Hovenkamp-Hermelink et al. 1988. Amperometric titrations are performed according to Williams et al. 1970 to determine the average amylose content per sample.

Example 24 cDNA Isolation from Barley

[0348] A database search using the Arabidopsis genes At3g18660 and At1g77130 against an in-house database identified two barley sequences. The accessions corresponding to Genbank: BE438665 and Genbank: BE438754 showed significant similarity to the Arabidopsis PGSIP genes (9e-34). The sequences, called Barley SEQ1 and Barley SEQ2, are shown in SEQ ID NOs: 16 and 18.

[0349] All publications, patents and patent applications mentioned in this specification are herein incorporated by reference into the specification to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated herein by reference.

[0350] Those skilled in the art will recognize, or through routine experimentation will be able to ascertain, many equivalents to the particular embodiments of the invention described herein. The claimed invention intends to encompass all such equivalents. Having hereinabove disclosed exemplary embodiments of the present invention, those skilled in the art will recognize that this disclosure is only exemplary such that various alternatives, adaptations, and modifications are within the scope of the invention, and are contemplated by the Applicants. Accordingly, the present invention is not limited to the specific embodiments as illustrated above, but is defined by the following claims.

1 35 1 3750 DNA Arabidopsis thaliana CAAT_signal (373)..(376)TATA_signal (424)..(428) intron (593)..(680) intron (919)..(1038) intron(1656)..(1761) intron (2537)..(2990) 1 aatatgtaca tgcaataaaa catagtaatatatttctttc cactatatat atatattgaa 60 ttcaatgact taaaaccttt caaaaaaatatttttgctta tataatcaag tgagttattg 120 gtaaagtgta tctttatttt gaaaaaaaaactcattattt tgaaaataaa ttatggttct 180 ctttacaaag aaatgatcaa agtttggtggacatatatat gtcaatcata agagagtcac 240 aaactgagaa tggagtttaa actaaagagctacaatatta tccacaattt aaaacatttt 300 attaaaatca cgataacttc aaaaagagaaaatcaaaaat taactttgtt aaaaaggtgg 360 gtatgaaaaa tacaattttc ttatttcctaacaaaaacaa aaatagaaac aaaggaaatg 420 tgatataaga agattaaaag agacgttatgtctcacctat atttgctctc tcctcttcct 480 tgtccaattc tactgtccca atccatcagttttatatggc aaactctccc gctgctcctg 540 cacccaccac cacaaccggt ggtgactcccggcgacgcct ctccgcgtcc atgtaagtgt 600 atagtataat actctctaag taatgattaaaaaaatctga acaaaatcgt ctaattgtgg 660 ctttgtgtgt gtttaagcag agaagcaatatgcaagagga gattccggag aaatagcaaa 720 ggaggtggca gatcggatat ggtgaaaccgtttaatatca taaatttttc gacacaagac 780 aaaaacagta gttgttgttg tttcaccaagtttcagatcg tgaagcttct cttgtttatc 840 cttctctctg ccactctctt caccattatctattctcctg aagcttatca tcattctctt 900 tcccactcat cttctcggta aatctatttcttttttccat caccaacatt tacattcttg 960 acctcaaaaa tgttcacatg caaatttttacttttgcctc tatctcttat aatactatct 1020 taaaattatg aaattagatg gatatggagaagacaagatc cacgttactt ctcggatctg 1080 gatataaact gggacgatgt gactaaaacccttgagaaca tcgaagaagg ccgtacgatc 1140 ggtgtcttga attttgattc gaacgagatccaacgatgga gagaagtatc caagagcaag 1200 gacaatgggg atgaagaaaa agttgttgtattgaatctag attacgcaga caagaatgtg 1260 acttgggacg cactatatcc agagtggatcgatgaggagc aagaaacaga ggtccctgtt 1320 tgtcctaata tcccgaacat taaggtacctacaagaagac tcgatctgat cgtcgtgaaa 1380 cttccttgtc ggaaagaagg gaattggtcgagagacgtcg ggagattgca tctacagcta 1440 gcggctgcaa ctgtggcggc ttcggccaaagggtttttca ggggacatgt gttttttgta 1500 tctagatgct ttccgattcc gaatcttttccggtgtaaag atcttgtgtc tcggagaggc 1560 gatgtttggt tgtacaaacc 
taatcttgataccttgagag acaagcttca gctgcctgta 1620 gggtcttgtg agctatctct tcctcttggcatccaaggta gaataaaaat gactcccgaa 1680 attacttgtt tagatttgaa aacaaatttgaaaaatcgtc gctaagttaa ctagtgtctg 1740 ttttcttcca tgaattttac agataggccaagcttaggaa accctaaaag agaagcttac 1800 gcaacaattc ttcactcagc tcacgtttacgtctgcggtg caatcgccgc ggctcagagc 1860 ataagacagt ctggttcgac gagagaccttgttatccttg ttgatgacaa catcagcggt 1920 taccaccgga gtggactaga agccgcgggttggcaaatcc ggacgataca gaggattcga 1980 aaccctaagg cagagaaaga tgcttacaacgaatggaact acagcaagtt ccggctatgg 2040 cagctgactg attacgacaa aatcattttcatcgacgcgg atctcttaat cttgagaaac 2100 atcgatttct tgttctcgat gcctgagatctcagctacag gaaacaatgg aactctgttt 2160 aattcaggag ttatggtgat cgagccttgcaactgtacgt ttcagcttct gatggaacat 2220 ataaacgaga ttgagtctta taacggtggagatcaaggtt acttaaacga ggtattcaca 2280 tggtggcacc ggattccaaa acatatgaatttcttgaagc atttttggat tggcgatgaa 2340 gatgacgcga aacgcaagaa aacagagctttttggagcag agcctcctgt tctttatgtt 2400 cttcattacc ttgggatgaa gccgtggttatgttaccgtg actacgactg taacttcaac 2460 tccgacatat tcgttgagtt tgctaccgatatcgctcatc gaaaatggtg gatggtccac 2520 gacgccatgc cacaggtgat tcactctctcctaaaaacct taatagaact caaaaatcac 2580 ataatatttt caatctcata ttgtgatcaatattcaaaat attattaggc gtttagtcat 2640 gcgttgagag actaactgca tagcattatttctttctcaa aaatttccaa aacttgaaaa 2700 aataaataaa ctaaaaatta cttactacccaagtttagaa taaccatatg aaatttgaat 2760 atacgaaaat cttggtgggt tagtaaatgcagaattagcc ccctacgcag taggcatcaa 2820 gttttaatgt ctatgtttta tacaccttataaaaaaatca tttcaaattt tctttcttta 2880 tgattagttt aaaaaaacat tggttggcagaaatataaaa atagttagac gttttcccaa 2940 attattctaa aattgtgacg gttagtaattaccatatatg atattttgca ggaacttcac 3000 caattctgtt acttgcgatc caagcaaaaggcacagctgg aatatgatcg ccggcaagca 3060 gaggccgcaa attatgccga cggtcattggaaaataagag taaaggaccc gagattcaaa 3120 atttgcatcg acaaattatg taattggaaaagtatgctgc ggcattgggg cgaatcaaat 3180 tggactgact acgagtcttt tgttcccaccccaccagcca ttaccgtaga ccggagatca 3240 tcacttcccg gccataactt gtgacgcaataattatacat acttattaat 
ggatttcatg 3300 agttttttgg tttgaattgt tgctgcgagattaggtgaat atcagttgtg taactatatc 3360 tttttcctat agtttgttca aattgaataaaacatttttt tgcagtttaa ccacaaaata 3420 aaacatatgt cgtatttata tgccatttttgtatacaaac acaaactcaa aaatgttagt 3480 aacattcaaa tagtttatac agaaacgatagattatagac ttacatatag ccaaacaaca 3540 caaattaatt gatgtaacta aacatatgtagtataattaa actttcgaaa aatccaaatt 3600 tttagtcgaa tcgcagtgta gtatgtatacattacgtata gtatataaat ctatgtgtgt 3660 gtatatcagt gtatgtattt gtgtatgtatgtacatgtga aaagaatctc tactaaagat 3720 ttccataata ttcaaccaaa aaccaaagtt3750 2 1980 DNA Arabidopsis thaliana CDS (1)..(1980) transit_peptide(1)..(195) 2 atg gca aac tct ccc gct gct cct gca ccc acc acc aca acc ggtggt 48 Met Ala Asn Ser Pro Ala Ala Pro Ala Pro Thr Thr Thr Thr Gly Gly 15 10 15 gac tcc cgg cga cgc ctc tcc gcg tcc ata gaa gca ata tgc aag agg96 Asp Ser Arg Arg Arg Leu Ser Ala Ser Ile Glu Ala Ile Cys Lys Arg 20 2530 aga ttc cgg aga aat agc aaa gga ggt ggc aga tcg gat atg gtg aaa 144Arg Phe Arg Arg Asn Ser Lys Gly Gly Gly Arg Ser Asp Met Val Lys 35 40 45ccg ttt aat atc ata aat ttt tcg aca caa gac aaa aac agt agt tgt 192 ProPhe Asn Ile Ile Asn Phe Ser Thr Gln Asp Lys Asn Ser Ser Cys 50 55 60 tgttgt ttc acc aag ttt cag atc gtg aag ctt ctc ttg ttt atc ctt 240 Cys CysPhe Thr Lys Phe Gln Ile Val Lys Leu Leu Leu Phe Ile Leu 65 70 75 80 ctctct gcc act ctc ttc acc att atc tat tct cct gaa gct tat cat 288 Leu SerAla Thr Leu Phe Thr Ile Ile Tyr Ser Pro Glu Ala Tyr His 85 90 95 cat tctctt tcc cac tca tct tct cgg tgg ata tgg aga aga caa gat 336 His Ser LeuSer His Ser Ser Ser Arg Trp Ile Trp Arg Arg Gln Asp 100 105 110 cca cgttac ttc tcg gat ctg gat ata aac tgg gac gat gtg act aaa 384 Pro Arg TyrPhe Ser Asp Leu Asp Ile Asn Trp Asp Asp Val Thr Lys 115 120 125 acc cttgag aac atc gaa gaa ggc cgt acg atc ggt gtc ttg aat ttt 432 Thr Leu GluAsn Ile Glu Glu Gly Arg Thr Ile Gly Val Leu Asn Phe 130 135 140 gat tcgaac gag atc caa cga tgg aga gaa gta tcc aag agc aag gac 480 Asp Ser AsnGlu Ile Gln Arg Trp Arg Glu Val Ser Lys 
Ser Lys Asp 145 150 155 160 aatggg gat gaa gaa aaa gtt gtt gta ttg aat cta gat tac gca gac 528 Asn GlyAsp Glu Glu Lys Val Val Val Leu Asn Leu Asp Tyr Ala Asp 165 170 175 aagaat gtg act tgg gac gca cta tat cca gag tgg atc gat gag gag 576 Lys AsnVal Thr Trp Asp Ala Leu Tyr Pro Glu Trp Ile Asp Glu Glu 180 185 190 caagaa aca gag gtc cct gtt tgt cct aat atc ccg aac att aag gta 624 Gln GluThr Glu Val Pro Val Cys Pro Asn Ile Pro Asn Ile Lys Val 195 200 205 cctaca aga aga ctc gat ctg atc gtc gtg aaa ctt cct tgt cgg aaa 672 Pro ThrArg Arg Leu Asp Leu Ile Val Val Lys Leu Pro Cys Arg Lys 210 215 220 gaaggg aat tgg tcg aga gac gtc ggg aga ttg cat cta cag cta gcg 720 Glu GlyAsn Trp Ser Arg Asp Val Gly Arg Leu His Leu Gln Leu Ala 225 230 235 240gct gca act gtg gcg gct tcg gcc aaa ggg ttt ttc agg gga cat gtg 768 AlaAla Thr Val Ala Ala Ser Ala Lys Gly Phe Phe Arg Gly His Val 245 250 255ttt ttt gta tct aga tgc ttt ccg att ccg aat ctt ttc cgg tgt aaa 816 PhePhe Val Ser Arg Cys Phe Pro Ile Pro Asn Leu Phe Arg Cys Lys 260 265 270gat ctt gtg tct cgg aga ggc gat gtt tgg ttg tac aaa cct aat ctt 864 AspLeu Val Ser Arg Arg Gly Asp Val Trp Leu Tyr Lys Pro Asn Leu 275 280 285gat acc ttg aga gac aag ctt cag ctg cct gta ggg tct tgt gag cta 912 AspThr Leu Arg Asp Lys Leu Gln Leu Pro Val Gly Ser Cys Glu Leu 290 295 300tct ctt cct ctt ggc atc caa gat agg cca agc tta gga aac cct aaa 960 SerLeu Pro Leu Gly Ile Gln Asp Arg Pro Ser Leu Gly Asn Pro Lys 305 310 315320 aga gaa gct tac gca aca att ctt cac tca gct cac gtt tac gtc tgc 1008Arg Glu Ala Tyr Ala Thr Ile Leu His Ser Ala His Val Tyr Val Cys 325 330335 ggt gca atc gcc gcg gct cag agc ata aga cag tct ggt tcg acg aga 1056Gly Ala Ile Ala Ala Ala Gln Ser Ile Arg Gln Ser Gly Ser Thr Arg 340 345350 gac ctt gtt atc ctt gtt gat gac aac atc agc ggt tac cac cgg agt 1104Asp Leu Val Ile Leu Val Asp Asp Asn Ile Ser Gly Tyr His Arg Ser 355 360365 gga cta gaa gcc gcg ggt tgg caa atc cgg acg ata cag agg att cga 1152Gly Leu Glu Ala Ala Gly Trp Gln Ile Arg Thr Ile 
Gln Arg Ile Arg 370 375380 aac cct aag gca gag aaa gat gct tac aac gaa tgg aac tac agc aag 1200Asn Pro Lys Ala Glu Lys Asp Ala Tyr Asn Glu Trp Asn Tyr Ser Lys 385 390395 400 ttc cgg cta tgg cag ctg act gat tac gac aaa atc att ttc atc gac1248 Phe Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Ile Ile Phe Ile Asp 405410 415 gcg gat ctc tta atc ttg aga aac atc gat ttc ttg ttc tcg atg cct1296 Ala Asp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu Phe Ser Met Pro 420425 430 gag atc tca gct aca gga aac aat gga act ctg ttt aat tca gga gtt1344 Glu Ile Ser Ala Thr Gly Asn Asn Gly Thr Leu Phe Asn Ser Gly Val 435440 445 atg gtg atc gag cct tgc aac tgt acg ttt cag ctt ctg atg gaa cat1392 Met Val Ile Glu Pro Cys Asn Cys Thr Phe Gln Leu Leu Met Glu His 450455 460 ata aac gag att gag tct tat aac ggt gga gat caa ggt tac tta aac1440 Ile Asn Glu Ile Glu Ser Tyr Asn Gly Gly Asp Gln Gly Tyr Leu Asn 465470 475 480 gag gta ttc aca tgg tgg cac cgg att cca aaa cat atg aat ttcttg 1488 Glu Val Phe Thr Trp Trp His Arg Ile Pro Lys His Met Asn Phe Leu485 490 495 aag cat ttt tgg att ggc gat gaa gat gac gcg aaa cgc aag aaaaca 1536 Lys His Phe Trp Ile Gly Asp Glu Asp Asp Ala Lys Arg Lys Lys Thr500 505 510 gag ctt ttt gga gca gag cct cct gtt ctt tat gtt ctt cat tacctt 1584 Glu Leu Phe Gly Ala Glu Pro Pro Val Leu Tyr Val Leu His Tyr Leu515 520 525 ggg atg aag ccg tgg tta tgt tac cgt gac tac gac tgt aac ttcaac 1632 Gly Met Lys Pro Trp Leu Cys Tyr Arg Asp Tyr Asp Cys Asn Phe Asn530 535 540 tcc gac ata ttc gtt gag ttt gct acc gat atc gct cat cga aaatgg 1680 Ser Asp Ile Phe Val Glu Phe Ala Thr Asp Ile Ala His Arg Lys Trp545 550 555 560 tgg atg gtc cac gac gcc atg cca cag gaa ctt cac caa ttctgt tac 1728 Trp Met Val His Asp Ala Met Pro Gln Glu Leu His Gln Phe CysTyr 565 570 575 ttg cga tcc aag caa aag gca cag ctg gaa tat gat cgc cggcaa gca 1776 Leu Arg Ser Lys Gln Lys Ala Gln Leu Glu Tyr Asp Arg Arg GlnAla 580 585 590 gag gcc gca aat tat gcc gac ggt cat tgg aaa ata aga gtaaag gac 1824 Glu Ala Ala Asn Tyr Ala Asp Gly 
His Trp Lys Ile Arg Val LysAsp 595 600 605 ccg aga ttc aaa att tgc atc gac aaa tta tgt aat tgg aaaagt atg 1872 Pro Arg Phe Lys Ile Cys Ile Asp Lys Leu Cys Asn Trp Lys SerMet 610 615 620 ctg cgg cat tgg ggc gaa tca aat tgg act gac tac gag tctttt gtt 1920 Leu Arg His Trp Gly Glu Ser Asn Trp Thr Asp Tyr Glu Ser PheVal 625 630 635 640 ccc acc cca cca gcc att acc gta gac cgg aga tca tcactt ccc ggc 1968 Pro Thr Pro Pro Ala Ile Thr Val Asp Arg Arg Ser Ser LeuPro Gly 645 650 655 cat aac ttg tga 1980 His Asn Leu * 3 659 PRTArabidopsis thaliana 3 Met Ala Asn Ser Pro Ala Ala Pro Ala Pro Thr ThrThr Thr Gly Gly 1 5 10 15 Asp Ser Arg Arg Arg Leu Ser Ala Ser Ile GluAla Ile Cys Lys Arg 20 25 30 Arg Phe Arg Arg Asn Ser Lys Gly Gly Gly ArgSer Asp Met Val Lys 35 40 45 Pro Phe Asn Ile Ile Asn Phe Ser Thr Gln AspLys Asn Ser Ser Cys 50 55 60 Cys Cys Phe Thr Lys Phe Gln Ile Val Lys LeuLeu Leu Phe Ile Leu 65 70 75 80 Leu Ser Ala Thr Leu Phe Thr Ile Ile TyrSer Pro Glu Ala Tyr His 85 90 95 His Ser Leu Ser His Ser Ser Ser Arg TrpIle Trp Arg Arg Gln Asp 100 105 110 Pro Arg Tyr Phe Ser Asp Leu Asp IleAsn Trp Asp Asp Val Thr Lys 115 120 125 Thr Leu Glu Asn Ile Glu Glu GlyArg Thr Ile Gly Val Leu Asn Phe 130 135 140 Asp Ser Asn Glu Ile Gln ArgTrp Arg Glu Val Ser Lys Ser Lys Asp 145 150 155 160 Asn Gly Asp Glu GluLys Val Val Val Leu Asn Leu Asp Tyr Ala Asp 165 170 175 Lys Asn Val ThrTrp Asp Ala Leu Tyr Pro Glu Trp Ile Asp Glu Glu 180 185 190 Gln Glu ThrGlu Val Pro Val Cys Pro Asn Ile Pro Asn Ile Lys Val 195 200 205 Pro ThrArg Arg Leu Asp Leu Ile Val Val Lys Leu Pro Cys Arg Lys 210 215 220 GluGly Asn Trp Ser Arg Asp Val Gly Arg Leu His Leu Gln Leu Ala 225 230 235240 Ala Ala Thr Val Ala Ala Ser Ala Lys Gly Phe Phe Arg Gly His Val 245250 255 Phe Phe Val Ser Arg Cys Phe Pro Ile Pro Asn Leu Phe Arg Cys Lys260 265 270 Asp Leu Val Ser Arg Arg Gly Asp Val Trp Leu Tyr Lys Pro AsnLeu 275 280 285 Asp Thr Leu Arg Asp Lys Leu Gln Leu Pro Val Gly Ser CysGlu Leu 290 295 300 Ser Leu Pro Leu Gly Ile Gln Asp Arg 
Pro Ser Leu GlyAsn Pro Lys 305 310 315 320 Arg Glu Ala Tyr Ala Thr Ile Leu His Ser AlaHis Val Tyr Val Cys 325 330 335 Gly Ala Ile Ala Ala Ala Gln Ser Ile ArgGln Ser Gly Ser Thr Arg 340 345 350 Asp Leu Val Ile Leu Val Asp Asp AsnIle Ser Gly Tyr His Arg Ser 355 360 365 Gly Leu Glu Ala Ala Gly Trp GlnIle Arg Thr Ile Gln Arg Ile Arg 370 375 380 Asn Pro Lys Ala Glu Lys AspAla Tyr Asn Glu Trp Asn Tyr Ser Lys 385 390 395 400 Phe Arg Leu Trp GlnLeu Thr Asp Tyr Asp Lys Ile Ile Phe Ile Asp 405 410 415 Ala Asp Leu LeuIle Leu Arg Asn Ile Asp Phe Leu Phe Ser Met Pro 420 425 430 Glu Ile SerAla Thr Gly Asn Asn Gly Thr Leu Phe Asn Ser Gly Val 435 440 445 Met ValIle Glu Pro Cys Asn Cys Thr Phe Gln Leu Leu Met Glu His 450 455 460 IleAsn Glu Ile Glu Ser Tyr Asn Gly Gly Asp Gln Gly Tyr Leu Asn 465 470 475480 Glu Val Phe Thr Trp Trp His Arg Ile Pro Lys His Met Asn Phe Leu 485490 495 Lys His Phe Trp Ile Gly Asp Glu Asp Asp Ala Lys Arg Lys Lys Thr500 505 510 Glu Leu Phe Gly Ala Glu Pro Pro Val Leu Tyr Val Leu His TyrLeu 515 520 525 Gly Met Lys Pro Trp Leu Cys Tyr Arg Asp Tyr Asp Cys AsnPhe Asn 530 535 540 Ser Asp Ile Phe Val Glu Phe Ala Thr Asp Ile Ala HisArg Lys Trp 545 550 555 560 Trp Met Val His Asp Ala Met Pro Gln Glu LeuHis Gln Phe Cys Tyr 565 570 575 Leu Arg Ser Lys Gln Lys Ala Gln Leu GluTyr Asp Arg Arg Gln Ala 580 585 590 Glu Ala Ala Asn Tyr Ala Asp Gly HisTrp Lys Ile Arg Val Lys Asp 595 600 605 Pro Arg Phe Lys Ile Cys Ile AspLys Leu Cys Asn Trp Lys Ser Met 610 615 620 Leu Arg His Trp Gly Glu SerAsn Trp Thr Asp Tyr Glu Ser Phe Val 625 630 635 640 Pro Thr Pro Pro AlaIle Thr Val Asp Arg Arg Ser Ser Leu Pro Gly 645 650 655 His Asn Leu 4560 DNA Zea mays 4 aaaattagca gcagccacag caagaggcaa tagaggaattcatgtgctgt ttctgactga 60 ttgcttccca attccaaacc tcttctcttg caaggacctagtgaaacgtg aaggcaatgc 120 ttggatgtac aaacctgacg tgaaggctct aaaggagaagctcaggctgc ctgttggttc 180 ctgtgagctt gctgttccac tcaacgcaaa agcacgactctacacggtag acagacgcag 240 agaagcatat gctacaatac tgcattcagc aagtgaatatgtttgcggtg 
cgataacagc 300 agctcaaagc attcgtcaag caggatcaac aagagaccttgttattcttg ttgatgacac 360 cataagtgac taccaccgca aggggctgga atctgctgggtggaaggtta gaataataca 420 gaggatccgg aatcccaaag cggaacgtga tgcctacaacgaatggaact acagcaaatt 480 ccggctgtgg cagcttacag attacgacaa ggttattttcattgatgctg atctgctcat 540 cctgaggaac attgatttct 560 5 1034 DNA Zea mays5 gacgcgtaca acgagtggaa ctacagcaag ttcaggctgt ggcagctgac cgactacgac 60aaggtcatct tcatagacgc cgacctcctc atcctgagga acgtcgactt cctgttcgcc 120atgccggaga tcgccgcgac ggscaacaac gccacgctct tcaactccgg cgtcatggtc 180gtcgagccct ccaactgcac gttccgcctg ctcatggacc acatcgacga gatcacctcg 240tacaacggcg gggaccaggg gtacctcaac gagatattca cgtggtggca ccgcgtcccc 300aggcacatga acttcctcaa gcacttctgg gagggcgaca gcgaggccat gaaggcgaag 360aagacacagc tgttcggcgc ggacccgccg gtcctctacg tcctccacta ccttggcctc 420aagccgtggc tgtgcttcag agactacgac tgcaactgga acaacgccgg gatgcgcgag 480ttcgccagcg acgtcgcgca tgcccggtgg tggaaggtgc acgacaggat gccccggaag 540ctccagtcct actgcctgct gaggtcgcgg cagaaggcca ggctggagtg ggaccggagg 600caggccgaga aggccaactc tcaagatggc cactggcgcc tcaacgtcac ggacaccagg 660ctcaagacgt gctttgagaa gttctgcttc tgggagagca tgctctggca ttggggcgag 720aacagtaaca ggaccaagag cgtccccatg gcagccacga cggcaaggtc gtgatctgta 780gatatacgaa caccccatcc ccatatggca accatacatg catagcaata gcttgtatag 840gtagctatgc tttagttctt cgctatatat acagaataca ccactcgatc cctgttgttg 900tcaaggctgc agctctatgt cgctgccggc ctgccaccat ggctaacgat tcttttgggt 960tggctgctgt aataagtttc aggtacatgt aaatttccct gctgaaatta cgtgaccgcg 1020ttgagaaatg aatt 1034 6 3606 DNA Arabidopsis thaliana CDS (1)..(3606) 6atg tgt gtc aac ttc tct agt ctg aaa ctt gtt ttg ttt ctt atg atg 48 MetCys Val Asn Phe Ser Ser Leu Lys Leu Val Leu Phe Leu Met Met 1 5 10 15ctg gtt gct atg ttc aca ctc tac tgt tct cca ccg ttg caa att cct 96 LeuVal Ala Met Phe Thr Leu Tyr Cys Ser Pro Pro Leu Gln Ile Pro 20 25 30 gaagat cca tca agt ttt gca aac aaa tgg ata cta gaa cct gct gta 144 Glu AspPro Ser Ser Phe Ala Asn Lys Trp Ile Leu Glu Pro Ala Val 35 40 45 
acc acagat cct cgc tat ata gct aca tct gag atc aac tgg aac agt 192 Thr Thr AspPro Arg Tyr Ile Ala Thr Ser Glu Ile Asn Trp Asn Ser 50 55 60 atg tca cttgtt gtt gag cat tac tta tct ggc aga agc gag tat caa 240 Met Ser Leu ValVal Glu His Tyr Leu Ser Gly Arg Ser Glu Tyr Gln 65 70 75 80 gga att ggcttt cta aat ctc aac gat aac gag att aat cga tgg cag 288 Gly Ile Gly PheLeu Asn Leu Asn Asp Asn Glu Ile Asn Arg Trp Gln 85 90 95 gtg gtc ata aaatct cac tgt cag cat ata gct ttg cat cta gac cat 336 Val Val Ile Lys SerHis Cys Gln His Ile Ala Leu His Leu Asp His 100 105 110 gct gca agt aacata act tgg aaa tct tta tac ccg gaa tgg att gac 384 Ala Ala Ser Asn IleThr Trp Lys Ser Leu Tyr Pro Glu Trp Ile Asp 115 120 125 gag gaa gaa aaattc aaa gtc ccc act tgt cct tct ctt cct tgg att 432 Glu Glu Glu Lys PheLys Val Pro Thr Cys Pro Ser Leu Pro Trp Ile 130 135 140 caa gtt cct gacaag tct cga atc gat ctt atc att gcc aag ctc cca 480 Gln Val Pro Asp LysSer Arg Ile Asp Leu Ile Ile Ala Lys Leu Pro 145 150 155 160 tgt aac aagtca gga aaa tgg tca aga gat gtg gct aga ttg cac tta 528 Cys Asn Lys SerGly Lys Trp Ser Arg Asp Val Ala Arg Leu His Leu 165 170 175 caa ctt gcagca gct cga gtg gcg gca tct tct gaa ggg ctt cat gat 576 Gln Leu Ala AlaAla Arg Val Ala Ala Ser Ser Glu Gly Leu His Asp 180 185 190 gtt cat gtgatt ttg gta tca gat tgc ttt cca ata ccg aat ctt ttt 624 Val His Val IleLeu Val Ser Asp Cys Phe Pro Ile Pro Asn Leu Phe 195 200 205 acg ggt caagaa ctt gtt gcc cgt caa gga aac ata tgg ctg tat aag 672 Thr Gly Gln GluLeu Val Ala Arg Gln Gly Asn Ile Trp Leu Tyr Lys 210 215 220 cct aaa cttcac cag tta aga caa aag tta caa ctt cct gtt ggt tcc 720 Pro Lys Leu HisGln Leu Arg Gln Lys Leu Gln Leu Pro Val Gly Ser 225 230 235 240 tgt gaactt tct gtt cct ctt caa gct aaa gat aat ttc tac tcg gca 768 Cys Glu LeuSer Val Pro Leu Gln Ala Lys Asp Asn Phe Tyr Ser Ala 245 250 255 aat gccaag aaa gaa gcg tac gcg acg atc ttg cac tca gat gat gct 816 Asn Ala LysLys Glu Ala Tyr Ala Thr Ile Leu His Ser Asp Asp Ala 260 265 270 ttt 
gtctgt gga gcc att gca gta gca cag agc att cga atg tca ggc 864 Phe Val CysGly Ala Ile Ala Val Ala Gln Ser Ile Arg Met Ser Gly 275 280 285 tct actcgc aat ttg gta ata cta gtc gat gat tcg atc agt gaa tac 912 Ser Thr ArgAsn Leu Val Ile Leu Val Asp Asp Ser Ile Ser Glu Tyr 290 295 300 cat agaagt ggc ttg gaa tca gct gga tgg aag att cac aca ttt caa 960 His Arg SerGly Leu Glu Ser Ala Gly Trp Lys Ile His Thr Phe Gln 305 310 315 320 agaatc aga aac ccg aaa gct gaa gca aat gca tat aac caa tgg aac 1008 Arg IleArg Asn Pro Lys Ala Glu Ala Asn Ala Tyr Asn Gln Trp Asn 325 330 335 tacagc aaa ttc cgt ctt tgg gaa ttg aca gaa tac aac aag atc atc 1056 Tyr SerLys Phe Arg Leu Trp Glu Leu Thr Glu Tyr Asn Lys Ile Ile 340 345 350 ttcatt gat gca gac atg ctt atc ctc aga aac atg gat ttc ctc ttc 1104 Phe IleAsp Ala Asp Met Leu Ile Leu Arg Asn Met Asp Phe Leu Phe 355 360 365 gagtac ccc gaa atc tcc aca act gga aac gac ggt acg ctc ttc aac 1152 Glu TyrPro Glu Ile Ser Thr Thr Gly Asn Asp Gly Thr Leu Phe Asn 370 375 380 tccggt cta atg gtg att gaa cca tca aat tca aca ttc cag tta cta 1200 Ser GlyLeu Met Val Ile Glu Pro Ser Asn Ser Thr Phe Gln Leu Leu 385 390 395 400atg gat cac atc aac gat atc aat tcc tac aat gga gga gac caa ggt 1248 MetAsp His Ile Asn Asp Ile Asn Ser Tyr Asn Gly Gly Asp Gln Gly 405 410 415tac ctt aac gag ata ttc aca tgg tgg cat cgg att cca aaa cac atg 1296 TyrLeu Asn Glu Ile Phe Thr Trp Trp His Arg Ile Pro Lys His Met 420 425 430aat ttc ttg aag cat ttc tgg gaa gga gac aca cct aag cac agg aaa 1344 AsnPhe Leu Lys His Phe Trp Glu Gly Asp Thr Pro Lys His Arg Lys 435 440 445tct aag acg aga cta ttt gga gct gat cct ccg ata ctc tac gtt ctt 1392 SerLys Thr Arg Leu Phe Gly Ala Asp Pro Pro Ile Leu Tyr Val Leu 450 455 460cat tac cta ggt tac aac aaa cca tgg gta tgc ttc aga gac tac gat 1440 HisTyr Leu Gly Tyr Asn Lys Pro Trp Val Cys Phe Arg Asp Tyr Asp 465 470 475480 tgc aat tgg aat gtc gtt gga tac cat caa ttc gcg agc gat gaa gca 1488Cys Asn Trp Asn Val Val Gly Tyr His Gln Phe Ala Ser Asp Glu Ala 485 
490495 cac aaa act tgg tgg aga gtg cac gac gcg atg cct aag aaa ttg cag 1536His Lys Thr Trp Trp Arg Val His Asp Ala Met Pro Lys Lys Leu Gln 500 505510 agg ttt tgt cta ctg agt tcg aaa caa aag gcg caa ctt gag tgg gat 1584Arg Phe Cys Leu Leu Ser Ser Lys Gln Lys Ala Gln Leu Glu Trp Asp 515 520525 cgg aga caa gct gag aaa gcg aat tac aga gac gga cat tgg agg att 1632Arg Arg Gln Ala Glu Lys Ala Asn Tyr Arg Asp Gly His Trp Arg Ile 530 535540 aag atc aaa gat aag aga ctt acg act tgt ttt gaa gat ttc tgt ttc 1680Lys Ile Lys Asp Lys Arg Leu Thr Thr Cys Phe Glu Asp Phe Cys Phe 545 550555 560 tgg gag agt atg ctt tgg cat tgg ggc gat tat gaa att ctc gaa acc1728 Trp Glu Ser Met Leu Trp His Trp Gly Asp Tyr Glu Ile Leu Glu Thr 565570 575 gac cct ggt ctt acg gag acg atg ata cct tcc tca agt ccc atg gag1776 Asp Pro Gly Leu Thr Glu Thr Met Ile Pro Ser Ser Ser Pro Met Glu 580585 590 tca aga cat cga ctc tcg ttc tca aat gag aag aca agt agg agg aga1824 Ser Arg His Arg Leu Ser Phe Ser Asn Glu Lys Thr Ser Arg Arg Arg 595600 605 ttt caa aga att gag aag ggt gtc aag ttc aac act ctg aaa ctt gtg1872 Phe Gln Arg Ile Glu Lys Gly Val Lys Phe Asn Thr Leu Lys Leu Val 610615 620 ttg att tgt ata atg ctt gga gct ttg ttc acg atc tac cgt ttt cgt1920 Leu Ile Cys Ile Met Leu Gly Ala Leu Phe Thr Ile Tyr Arg Phe Arg 625630 635 640 tat cca ccg cta caa att cct gaa att cca act agt ttt ggt cttact 1968 Tyr Pro Pro Leu Gln Ile Pro Glu Ile Pro Thr Ser Phe Gly Leu Thr645 650 655 act gat cct cgc tat gta gct aca gct gag atc aac tgg aac catatg 2016 Thr Asp Pro Arg Tyr Val Ala Thr Ala Glu Ile Asn Trp Asn His Met660 665 670 tca aat ctt gtt gag aag cac gta ttt ggt aga agc gag tat caagga 2064 Ser Asn Leu Val Glu Lys His Val Phe Gly Arg Ser Glu Tyr Gln Gly675 680 685 att ggt ctt ata aat ctt aac gat aac gag att gat cga ttc aaggag 2112 Ile Gly Leu Ile Asn Leu Asn Asp Asn Glu Ile Asp Arg Phe Lys Glu690 695 700 gta acg aaa tct gac tgt gat cat gta gct ttg cat cta gat tatgct 2160 Val Thr Lys Ser Asp Cys Asp His Val Ala Leu His Leu Asp 
Tyr Ala705 710 715 720 gca aag aac ata aca tgg gaa tct tta tac ccg gaa tgg attgat gaa 2208 Ala Lys Asn Ile Thr Trp Glu Ser Leu Tyr Pro Glu Trp Ile AspGlu 725 730 735 gtt gaa gaa ttc gaa gtc cct act tgt cct tct ctg cct ttgatt caa 2256 Val Glu Glu Phe Glu Val Pro Thr Cys Pro Ser Leu Pro Leu IleGln 740 745 750 att cct ggc aag cct cgg att gat ctt gta att gcc aag cttccg tgt 2304 Ile Pro Gly Lys Pro Arg Ile Asp Leu Val Ile Ala Lys Leu ProCys 755 760 765 gat aaa tca gga aaa tgg tct aga gat gtg gct cgc ttg cattta caa 2352 Asp Lys Ser Gly Lys Trp Ser Arg Asp Val Ala Arg Leu His LeuGln 770 775 780 ctt gca gca gct cga gtg gcg gct tct tct aaa gga ctt cataat gtt 2400 Leu Ala Ala Ala Arg Val Ala Ala Ser Ser Lys Gly Leu His AsnVal 785 790 795 800 cat gtg att ttg gta tct gat tgc ttt cca ata ccg aatctt ttt acg 2448 His Val Ile Leu Val Ser Asp Cys Phe Pro Ile Pro Asn LeuPhe Thr 805 810 815 ggt caa gaa ctt gtt gcc cgt caa gga aac ata tgg ctgtat aag cct 2496 Gly Gln Glu Leu Val Ala Arg Gln Gly Asn Ile Trp Leu TyrLys Pro 820 825 830 aat ctt cac cag cta aga caa aag tta cag ctt cct gttggt tcc tgt 2544 Asn Leu His Gln Leu Arg Gln Lys Leu Gln Leu Pro Val GlySer Cys 835 840 845 gaa ctt tct gtt cct ctt caa gct aaa gat aat ttc tactcc gca ggt 2592 Glu Leu Ser Val Pro Leu Gln Ala Lys Asp Asn Phe Tyr SerAla Gly 850 855 860 gca aag aaa gaa gct tac gcg act atc ttg cat tct gcccaa ttt tat 2640 Ala Lys Lys Glu Ala Tyr Ala Thr Ile Leu His Ser Ala GlnPhe Tyr 865 870 875 880 gtc tgt gga gcc att gca gct gca cag agc att cgaatg tca ggc tct 2688 Val Cys Gly Ala Ile Ala Ala Ala Gln Ser Ile Arg MetSer Gly Ser 885 890 895 act cgt gat ctg gtc ata ctt gtt gat gaa acg ataagc gaa tac cat 2736 Thr Arg Asp Leu Val Ile Leu Val Asp Glu Thr Ile SerGlu Tyr His 900 905 910 aaa agt ggc ttg gta gct gct gga tgg aag att caaatg ttt caa aga 2784 Lys Ser Gly Leu Val Ala Ala Gly Trp Lys Ile Gln MetPhe Gln Arg 915 920 925 atc agg aac ccg aat gct gta cca aat gcc tac aacgaa tgg aac tac 2832 Ile Arg Asn Pro Asn Ala Val Pro Asn Ala 
Tyr Asn GluTrp Asn Tyr 930 935 940 agc aag ttt cgt ctt tgg caa ctg act gaa tac agtaag atc atc ttc 2880 Ser Lys Phe Arg Leu Trp Gln Leu Thr Glu Tyr Ser LysIle Ile Phe 945 950 955 960 atc gat gca gac atg ctt atc ctg aga aac attgat ttc ctc ttc gag 2928 Ile Asp Ala Asp Met Leu Ile Leu Arg Asn Ile AspPhe Leu Phe Glu 965 970 975 ttc cct gag ata tca gca act gga aac aat gctacg ctc ttc aac tct 2976 Phe Pro Glu Ile Ser Ala Thr Gly Asn Asn Ala ThrLeu Phe Asn Ser 980 985 990 ggt cta atg gtg gtt gag cca tct aat tca acattc cag tta cta atg 3024 Gly Leu Met Val Val Glu Pro Ser Asn Ser Thr PheGln Leu Leu Met 995 1000 1005 gat aac att aat gaa gtt gtg tct tac aacgga gga gac caa ggt tac 3072 Asp Asn Ile Asn Glu Val Val Ser Tyr Asn GlyGly Asp Gln Gly Tyr 1010 1015 1020 ctt aac gag ata ttc aca tgg tgg catcgg att cca aaa cac atg aat 3120 Leu Asn Glu Ile Phe Thr Trp Trp His ArgIle Pro Lys His Met Asn 1025 1030 1035 1040 ttc ttg aag cat ttc tgg gaagga gac gaa cct gag att aaa aaa atg 3168 Phe Leu Lys His Phe Trp Glu GlyAsp Glu Pro Glu Ile Lys Lys Met 1045 1050 1055 aag acg agt cta ttt ggagct gat cct ccg atc cta tac gtt ctt cat 3216 Lys Thr Ser Leu Phe Gly AlaAsp Pro Pro Ile Leu Tyr Val Leu His 1060 1065 1070 tac cta ggt tat aacaaa ccc tgg tta tgc ttc aga gac tat gac tgc 3264 Tyr Leu Gly Tyr Asn LysPro Trp Leu Cys Phe Arg Asp Tyr Asp Cys 1075 1080 1085 aat tgg aat gtcgat att ttc cag gaa ttt gct agt gac gag gct cat 3312 Asn Trp Asn Val AspIle Phe Gln Glu Phe Ala Ser Asp Glu Ala His 1090 1095 1100 aaa acc tggtgg aga gtg cac gac gca atg cct gaa aac ttg cat aag 3360 Lys Thr Trp TrpArg Val His Asp Ala Met Pro Glu Asn Leu His Lys 1105 1110 1115 1120 ttctgt cta cta aga tcg aaa cag aag gcg caa ctt gaa tgg gat agg 3408 Phe CysLeu Leu Arg Ser Lys Gln Lys Ala Gln Leu Glu Trp Asp Arg 1125 1130 1135aga caa gca gag aaa ggg aac tac aaa gat gga cat tgg aag ata aag 3456 ArgGln Ala Glu Lys Gly Asn Tyr Lys Asp Gly His Trp Lys Ile Lys 1140 11451150 atc aaa gac aag aga ctt aag act tgt ttc gaa gat ttc tgc ttt 
tgg3504 Ile Lys Asp Lys Arg Leu Lys Thr Cys Phe Glu Asp Phe Cys Phe Trp1155 1160 1165 gag agt atg ctt tgg cat tgg ggt gag acg aac tct acc aacaat tct 3552 Glu Ser Met Leu Trp His Trp Gly Glu Thr Asn Ser Thr Asn AsnSer 1170 1175 1180 tcc acc acc acc act tca tca ccg ccg cat aaa acc gctctc cct tcc 3600 Ser Thr Thr Thr Thr Ser Ser Pro Pro His Lys Thr Ala LeuPro Ser 1185 1190 1195 1200 ctg tga 3606 Leu 7 1201 PRT Arabidopsisthaliana 7 Met Cys Val Asn Phe Ser Ser Leu Lys Leu Val Leu Phe Leu MetMet 1 5 10 15 Leu Val Ala Met Phe Thr Leu Tyr Cys Ser Pro Pro Leu GlnIle Pro 20 25 30 Glu Asp Pro Ser Ser Phe Ala Asn Lys Trp Ile Leu Glu ProAla Val 35 40 45 Thr Thr Asp Pro Arg Tyr Ile Ala Thr Ser Glu Ile Asn TrpAsn Ser 50 55 60 Met Ser Leu Val Val Glu His Tyr Leu Ser Gly Arg Ser GluTyr Gln 65 70 75 80 Gly Ile Gly Phe Leu Asn Leu Asn Asp Asn Glu Ile AsnArg Trp Gln 85 90 95 Val Val Ile Lys Ser His Cys Gln His Ile Ala Leu HisLeu Asp His 100 105 110 Ala Ala Ser Asn Ile Thr Trp Lys Ser Leu Tyr ProGlu Trp Ile Asp 115 120 125 Glu Glu Glu Lys Phe Lys Val Pro Thr Cys ProSer Leu Pro Trp Ile 130 135 140 Gln Val Pro Asp Lys Ser Arg Ile Asp LeuIle Ile Ala Lys Leu Pro 145 150 155 160 Cys Asn Lys Ser Gly Lys Trp SerArg Asp Val Ala Arg Leu His Leu 165 170 175 Gln Leu Ala Ala Ala Arg ValAla Ala Ser Ser Glu Gly Leu His Asp 180 185 190 Val His Val Ile Leu ValSer Asp Cys Phe Pro Ile Pro Asn Leu Phe 195 200 205 Thr Gly Gln Glu LeuVal Ala Arg Gln Gly Asn Ile Trp Leu Tyr Lys 210 215 220 Pro Lys Leu HisGln Leu Arg Gln Lys Leu Gln Leu Pro Val Gly Ser 225 230 235 240 Cys GluLeu Ser Val Pro Leu Gln Ala Lys Asp Asn Phe Tyr Ser Ala 245 250 255 AsnAla Lys Lys Glu Ala Tyr Ala Thr Ile Leu His Ser Asp Asp Ala 260 265 270Phe Val Cys Gly Ala Ile Ala Val Ala Gln Ser Ile Arg Met Ser Gly 275 280285 Ser Thr Arg Asn Leu Val Ile Leu Val Asp Asp Ser Ile Ser Glu Tyr 290295 300 His Arg Ser Gly Leu Glu Ser Ala Gly Trp Lys Ile His Thr Phe Gln305 310 315 320 Arg Ile Arg Asn Pro Lys Ala Glu Ala Asn Ala Tyr Asn GlnTrp Asn 325 
330 335 Tyr Ser Lys Phe Arg Leu Trp Glu Leu Thr Glu Tyr Asn Lys Ile Ile 340 345 350 Phe Ile Asp Ala Asp Met Leu Ile Leu Arg Asn Met Asp Phe Leu Phe 355 360 365 Glu Tyr Pro Glu Ile Ser Thr Thr Gly Asn Asp Gly Thr Leu Phe Asn 370 375 380 Ser Gly Leu Met Val Ile Glu Pro Ser Asn Ser Thr Phe Gln Leu Leu 385 390 395 400 Met Asp His Ile Asn Asp Ile Asn Ser Tyr Asn Gly Gly Asp Gln Gly 405 410 415 Tyr Leu Asn Glu Ile Phe Thr Trp Trp His Arg Ile Pro Lys His Met 420 425 430 Asn Phe Leu Lys His Phe Trp Glu Gly Asp Thr Pro Lys His Arg Lys 435 440 445 Ser Lys Thr Arg Leu Phe Gly Ala Asp Pro Pro Ile Leu Tyr Val Leu 450 455 460 His Tyr Leu Gly Tyr Asn Lys Pro Trp Val Cys Phe Arg Asp Tyr Asp 465 470 475 480 Cys Asn Trp Asn Val Val Gly Tyr His Gln Phe Ala Ser Asp Glu Ala 485 490 495 His Lys Thr Trp Trp Arg Val His Asp Ala Met Pro Lys Lys Leu Gln 500 505 510 Arg Phe Cys Leu Leu Ser Ser Lys Gln Lys Ala Gln Leu Glu Trp Asp 515 520 525 Arg Arg Gln Ala Glu Lys Ala Asn Tyr Arg Asp Gly His Trp Arg Ile 530 535 540 Lys Ile Lys Asp Lys Arg Leu Thr Thr Cys Phe Glu Asp Phe Cys Phe 545 550 555 560 Trp Glu Ser Met Leu Trp His Trp Gly Asp Tyr Glu Ile Leu Glu Thr 565 570 575 Asp Pro Gly Leu Thr Glu Thr Met Ile Pro Ser Ser Ser Pro Met Glu 580 585 590 Ser Arg His Arg Leu Ser Phe Ser Asn Glu Lys Thr Ser Arg Arg Arg 595 600 605 Phe Gln Arg Ile Glu Lys Gly Val Lys Phe Asn Thr Leu Lys Leu Val 610 615 620 Leu Ile Cys Ile Met Leu Gly Ala Leu Phe Thr Ile Tyr Arg Phe Arg 625 630 635 640 Tyr Pro Pro Leu Gln Ile Pro Glu Ile Pro Thr Ser Phe Gly Leu Thr 645 650 655 Thr Asp Pro Arg Tyr Val Ala Thr Ala Glu Ile Asn Trp Asn His Met 660 665 670 Ser Asn Leu Val Glu Lys His Val Phe Gly Arg Ser Glu Tyr Gln Gly 675 680 685 Ile Gly Leu Ile Asn Leu Asn Asp Asn Glu Ile Asp Arg Phe Lys Glu 690 695 700 Val Thr Lys Ser Asp Cys Asp His Val Ala Leu His Leu Asp Tyr Ala 705 710 715 720 Ala Lys Asn Ile Thr Trp Glu Ser Leu Tyr Pro Glu Trp Ile Asp Glu 725 730 735 Val Glu Glu Phe Glu Val Pro Thr Cys Pro Ser Leu Pro Leu Ile Gln 740 745 750 Ile Pro Gly Lys Pro Arg
Ile Asp Leu Val Ile Ala Lys Leu Pro Cys 755 760 765 Asp Lys Ser Gly Lys Trp Ser Arg Asp Val Ala Arg Leu His Leu Gln 770 775 780 Leu Ala Ala Ala Arg Val Ala Ala Ser Ser Lys Gly Leu His Asn Val 785 790 795 800 His Val Ile Leu Val Ser Asp Cys Phe Pro Ile Pro Asn Leu Phe Thr 805 810 815 Gly Gln Glu Leu Val Ala Arg Gln Gly Asn Ile Trp Leu Tyr Lys Pro 820 825 830 Asn Leu His Gln Leu Arg Gln Lys Leu Gln Leu Pro Val Gly Ser Cys 835 840 845 Glu Leu Ser Val Pro Leu Gln Ala Lys Asp Asn Phe Tyr Ser Ala Gly 850 855 860 Ala Lys Lys Glu Ala Tyr Ala Thr Ile Leu His Ser Ala Gln Phe Tyr 865 870 875 880 Val Cys Gly Ala Ile Ala Ala Ala Gln Ser Ile Arg Met Ser Gly Ser 885 890 895 Thr Arg Asp Leu Val Ile Leu Val Asp Glu Thr Ile Ser Glu Tyr His 900 905 910 Lys Ser Gly Leu Val Ala Ala Gly Trp Lys Ile Gln Met Phe Gln Arg 915 920 925 Ile Arg Asn Pro Asn Ala Val Pro Asn Ala Tyr Asn Glu Trp Asn Tyr 930 935 940 Ser Lys Phe Arg Leu Trp Gln Leu Thr Glu Tyr Ser Lys Ile Ile Phe 945 950 955 960 Ile Asp Ala Asp Met Leu Ile Leu Arg Asn Ile Asp Phe Leu Phe Glu 965 970 975 Phe Pro Glu Ile Ser Ala Thr Gly Asn Asn Ala Thr Leu Phe Asn Ser 980 985 990 Gly Leu Met Val Val Glu Pro Ser Asn Ser Thr Phe Gln Leu Leu Met 995 1000 1005 Asp Asn Ile Asn Glu Val Val Ser Tyr Asn Gly Gly Asp Gln Gly Tyr 1010 1015 1020 Leu Asn Glu Ile Phe Thr Trp Trp His Arg Ile Pro Lys His Met Asn 1025 1030 1035 1040 Phe Leu Lys His Phe Trp Glu Gly Asp Glu Pro Glu Ile Lys Lys Met 1045 1050 1055 Lys Thr Ser Leu Phe Gly Ala Asp Pro Pro Ile Leu Tyr Val Leu His 1060 1065 1070 Tyr Leu Gly Tyr Asn Lys Pro Trp Leu Cys Phe Arg Asp Tyr Asp Cys 1075 1080 1085 Asn Trp Asn Val Asp Ile Phe Gln Glu Phe Ala Ser Asp Glu Ala His 1090 1095 1100 Lys Thr Trp Trp Arg Val His Asp Ala Met Pro Glu Asn Leu His Lys 1105 1110 1115 1120 Phe Cys Leu Leu Arg Ser Lys Gln Lys Ala Gln Leu Glu Trp Asp Arg 1125 1130 1135 Arg Gln Ala Glu Lys Gly Asn Tyr Lys Asp Gly His Trp Lys Ile Lys 1140 1145 1150 Ile Lys Asp Lys Arg Leu Lys Thr Cys Phe Glu Asp Phe Cys Phe Trp 1155 1160 1165 Glu Ser Met Leu Trp
His Trp Gly Glu Thr AsnSer Thr Asn Asn Ser 1170 1175 1180 Ser Thr Thr Thr Thr Ser Ser Pro ProHis Lys Thr Ala Leu Pro Ser 1185 1190 1195 1200 Leu 8 1653 DNAArabidopsis thaliana CDS (1)..(1653) 8 atg ggg gcc aaa agc aaa agt tcgagt acg aga ttt ttt atg ttt tat 48 Met Gly Ala Lys Ser Lys Ser Ser SerThr Arg Phe Phe Met Phe Tyr 1 5 10 15 ctt ata cta ata tca ttg tcg tttttg ggt ttg ctc tta aac ttt aaa 96 Leu Ile Leu Ile Ser Leu Ser Phe LeuGly Leu Leu Leu Asn Phe Lys 20 25 30 cct ctg ttt ctg ctc aac ccc atg atcgct tct cct tcg ata gtt gag 144 Pro Leu Phe Leu Leu Asn Pro Met Ile AlaSer Pro Ser Ile Val Glu 35 40 45 att cgt tat tct ttg ccg gaa ccg gtt aaacgg act ccg ata tgg ctc 192 Ile Arg Tyr Ser Leu Pro Glu Pro Val Lys ArgThr Pro Ile Trp Leu 50 55 60 cga ctc att aga aac tat ctt ccg gat gag aaaaag atc cga gtg ggt 240 Arg Leu Ile Arg Asn Tyr Leu Pro Asp Glu Lys LysIle Arg Val Gly 65 70 75 80 ctt ctc aac atc gca gag aac gag cga gag agctac gag gca agc ggg 288 Leu Leu Asn Ile Ala Glu Asn Glu Arg Glu Ser TyrGlu Ala Ser Gly 85 90 95 acg tcg atc ttg gag aat gtc cac gtg tcg ctc gatcct ctt ccg aac 336 Thr Ser Ile Leu Glu Asn Val His Val Ser Leu Asp ProLeu Pro Asn 100 105 110 aat ctg aca tgg acg agt tta ttc ccg gtt tgg atcgac gag gat cac 384 Asn Leu Thr Trp Thr Ser Leu Phe Pro Val Trp Ile AspGlu Asp His 115 120 125 acg tgg cac att cct agt tgt cca gaa gtc cct ctccct aag atg gaa 432 Thr Trp His Ile Pro Ser Cys Pro Glu Val Pro Leu ProLys Met Glu 130 135 140 ggt tcc gaa gct gac gtg gac gtc gtc gtt gtc aaagtc ccg tgc gat 480 Gly Ser Glu Ala Asp Val Asp Val Val Val Val Lys ValPro Cys Asp 145 150 155 160 ggt ttc tcg gag aag aga ggg tta aga gac gttttc agg cta cag gtg 528 Gly Phe Ser Glu Lys Arg Gly Leu Arg Asp Val PheArg Leu Gln Val 165 170 175 aat ctg gcg gca gcg aat ctt gtg gtg gag agtggt cgg agg aat gtt 576 Asn Leu Ala Ala Ala Asn Leu Val Val Glu Ser GlyArg Arg Asn Val 180 185 190 gat cgg act gtg tac gtt gtc ttc atc gga tcttgt ggg cct atg cat 624 Asp Arg Thr Val Tyr Val Val Phe Ile Gly Ser 
CysGly Pro Met His 195 200 205 gag atc ttt agg tgt gat gag cgc gtg aag cgcgtg ggg gac tat tgg 672 Glu Ile Phe Arg Cys Asp Glu Arg Val Lys Arg ValGly Asp Tyr Trp 210 215 220 gtc tat agg cct gat ctt acg agg ttg aag cagaag ctt ctc atg cct 720 Val Tyr Arg Pro Asp Leu Thr Arg Leu Lys Gln LysLeu Leu Met Pro 225 230 235 240 cct ggt tca tgt cag att gct ccg cta ggtcaa gga gaa gca tgg ata 768 Pro Gly Ser Cys Gln Ile Ala Pro Leu Gly GlnGly Glu Ala Trp Ile 245 250 255 caa gac aag aac aga aat ctc aca tcc gaaaaa act aca tta tca tca 816 Gln Asp Lys Asn Arg Asn Leu Thr Ser Glu LysThr Thr Leu Ser Ser 260 265 270 ttt act gcc caa cgt gtc gct tac gtg acgtta cta cac tca tcg gag 864 Phe Thr Ala Gln Arg Val Ala Tyr Val Thr LeuLeu His Ser Ser Glu 275 280 285 gta tac gta tgc gga gca ata gcc tta gcacaa agc ata agg caa tct 912 Val Tyr Val Cys Gly Ala Ile Ala Leu Ala GlnSer Ile Arg Gln Ser 290 295 300 gga tca acc aag gac atg att ctc ctc cacgat gac tct ata acc aac 960 Gly Ser Thr Lys Asp Met Ile Leu Leu His AspAsp Ser Ile Thr Asn 305 310 315 320 atc tct ctc att ggc cta agc ctt gctggc tgg aaa cta cgg cga gtg 1008 Ile Ser Leu Ile Gly Leu Ser Leu Ala GlyTrp Lys Leu Arg Arg Val 325 330 335 gag aga att cgt agt cct ttt tcc aagaag cgt tct tac aat gag tgg 1056 Glu Arg Ile Arg Ser Pro Phe Ser Lys LysArg Ser Tyr Asn Glu Trp 340 345 350 aac tac agt aag tta cgt gtg tgg caagtg aca gat tac gac aaa cta 1104 Asn Tyr Ser Lys Leu Arg Val Trp Gln ValThr Asp Tyr Asp Lys Leu 355 360 365 gtg ttt ata gac gca gac ttc atc atcgtc aag aat att gat tac ctt 1152 Val Phe Ile Asp Ala Asp Phe Ile Ile ValLys Asn Ile Asp Tyr Leu 370 375 380 ttc tcc tat cct caa ctt tct gcc gctggc aat aac aaa gtc ttg ttc 1200 Phe Ser Tyr Pro Gln Leu Ser Ala Ala GlyAsn Asn Lys Val Leu Phe 385 390 395 400 aac tca gga gtc atg gtt ctg gagcca tca gct tgt tta ttc gag gat 1248 Asn Ser Gly Val Met Val Leu Glu ProSer Ala Cys Leu Phe Glu Asp 405 410 415 ttg atg ctt aaa tca ttc aag atcggg tca tac aac ggg gga gac caa 1296 Leu Met Leu Lys Ser Phe Lys Ile 
GlySer Tyr Asn Gly Gly Asp Gln 420 425 430 gga ttt ctg aac gaa tat ttc gtgtgg tgg cat agg cat gat aaa gcg 1344 Gly Phe Leu Asn Glu Tyr Phe Val TrpTrp His Arg His Asp Lys Ala 435 440 445 cgc aat ctt cca gaa aat tta gagggc ata cac tac ttg gga cta aaa 1392 Arg Asn Leu Pro Glu Asn Leu Glu GlyIle His Tyr Leu Gly Leu Lys 450 455 460 cca tgg cga tgt tac aga gac tacgat tgt aac tgg gac ttg aaa acg 1440 Pro Trp Arg Cys Tyr Arg Asp Tyr AspCys Asn Trp Asp Leu Lys Thr 465 470 475 480 cga cgt gtg tat gca agc gagtcg gtg cat gcg aga tgg tgg aaa gtg 1488 Arg Arg Val Tyr Ala Ser Glu SerVal His Ala Arg Trp Trp Lys Val 485 490 495 tac gac aag atg cct aag aagctg aaa ggt tat tgt ggt ttg aat ctt 1536 Tyr Asp Lys Met Pro Lys Lys LeuLys Gly Tyr Cys Gly Leu Asn Leu 500 505 510 aag atg gag aag aac gtt gagaag tgg agg aaa atg gct aag ctc aat 1584 Lys Met Glu Lys Asn Val Glu LysTrp Arg Lys Met Ala Lys Leu Asn 515 520 525 ggt ttt cct gaa aat cat tggaaa att aga ata aaa gat cct agg aag 1632 Gly Phe Pro Glu Asn His Trp LysIle Arg Ile Lys Asp Pro Arg Lys 530 535 540 aag aac cgt cta agt caa tga1653 Lys Asn Arg Leu Ser Gln 545 550 9 550 PRT Arabidopsis thaliana 9Met Gly Ala Lys Ser Lys Ser Ser Ser Thr Arg Phe Phe Met Phe Tyr 1 5 1015 Leu Ile Leu Ile Ser Leu Ser Phe Leu Gly Leu Leu Leu Asn Phe Lys 20 2530 Pro Leu Phe Leu Leu Asn Pro Met Ile Ala Ser Pro Ser Ile Val Glu 35 4045 Ile Arg Tyr Ser Leu Pro Glu Pro Val Lys Arg Thr Pro Ile Trp Leu 50 5560 Arg Leu Ile Arg Asn Tyr Leu Pro Asp Glu Lys Lys Ile Arg Val Gly 65 7075 80 Leu Leu Asn Ile Ala Glu Asn Glu Arg Glu Ser Tyr Glu Ala Ser Gly 8590 95 Thr Ser Ile Leu Glu Asn Val His Val Ser Leu Asp Pro Leu Pro Asn100 105 110 Asn Leu Thr Trp Thr Ser Leu Phe Pro Val Trp Ile Asp Glu AspHis 115 120 125 Thr Trp His Ile Pro Ser Cys Pro Glu Val Pro Leu Pro LysMet Glu 130 135 140 Gly Ser Glu Ala Asp Val Asp Val Val Val Val Lys ValPro Cys Asp 145 150 155 160 Gly Phe Ser Glu Lys Arg Gly Leu Arg Asp ValPhe Arg Leu Gln Val 165 170 175 Asn Leu Ala Ala Ala Asn Leu Val Val 
GluSer Gly Arg Arg Asn Val 180 185 190 Asp Arg Thr Val Tyr Val Val Phe IleGly Ser Cys Gly Pro Met His 195 200 205 Glu Ile Phe Arg Cys Asp Glu ArgVal Lys Arg Val Gly Asp Tyr Trp 210 215 220 Val Tyr Arg Pro Asp Leu ThrArg Leu Lys Gln Lys Leu Leu Met Pro 225 230 235 240 Pro Gly Ser Cys GlnIle Ala Pro Leu Gly Gln Gly Glu Ala Trp Ile 245 250 255 Gln Asp Lys AsnArg Asn Leu Thr Ser Glu Lys Thr Thr Leu Ser Ser 260 265 270 Phe Thr AlaGln Arg Val Ala Tyr Val Thr Leu Leu His Ser Ser Glu 275 280 285 Val TyrVal Cys Gly Ala Ile Ala Leu Ala Gln Ser Ile Arg Gln Ser 290 295 300 GlySer Thr Lys Asp Met Ile Leu Leu His Asp Asp Ser Ile Thr Asn 305 310 315320 Ile Ser Leu Ile Gly Leu Ser Leu Ala Gly Trp Lys Leu Arg Arg Val 325330 335 Glu Arg Ile Arg Ser Pro Phe Ser Lys Lys Arg Ser Tyr Asn Glu Trp340 345 350 Asn Tyr Ser Lys Leu Arg Val Trp Gln Val Thr Asp Tyr Asp LysLeu 355 360 365 Val Phe Ile Asp Ala Asp Phe Ile Ile Val Lys Asn Ile AspTyr Leu 370 375 380 Phe Ser Tyr Pro Gln Leu Ser Ala Ala Gly Asn Asn LysVal Leu Phe 385 390 395 400 Asn Ser Gly Val Met Val Leu Glu Pro Ser AlaCys Leu Phe Glu Asp 405 410 415 Leu Met Leu Lys Ser Phe Lys Ile Gly SerTyr Asn Gly Gly Asp Gln 420 425 430 Gly Phe Leu Asn Glu Tyr Phe Val TrpTrp His Arg His Asp Lys Ala 435 440 445 Arg Asn Leu Pro Glu Asn Leu GluGly Ile His Tyr Leu Gly Leu Lys 450 455 460 Pro Trp Arg Cys Tyr Arg AspTyr Asp Cys Asn Trp Asp Leu Lys Thr 465 470 475 480 Arg Arg Val Tyr AlaSer Glu Ser Val His Ala Arg Trp Trp Lys Val 485 490 495 Tyr Asp Lys MetPro Lys Lys Leu Lys Gly Tyr Cys Gly Leu Asn Leu 500 505 510 Lys Met GluLys Asn Val Glu Lys Trp Arg Lys Met Ala Lys Leu Asn 515 520 525 Gly PhePro Glu Asn His Trp Lys Ile Arg Ile Lys Asp Pro Arg Lys 530 535 540 LysAsn Arg Leu Ser Gln 545 550 10 1674 DNA Arabidopsis thaliana CDS(1)..(1674) 10 atg ggg aca aaa acc cat aat tct aga ggg aaa atc ttc atgatc tat 48 Met Gly Thr Lys Thr His Asn Ser Arg Gly Lys Ile Phe Met IleTyr 1 5 10 15 cta atc cta gtc tca ttg tca ctt cta ggt ttg atc tta cctttt aaa 96 Leu Ile 
Leu Val Ser Leu Ser Leu Leu Gly Leu Ile Leu Pro PheLys 20 25 30 cct ctt ttc cgg att act tct cca tct tca acg tta cgg att gatctt 144 Pro Leu Phe Arg Ile Thr Ser Pro Ser Ser Thr Leu Arg Ile Asp Leu35 40 45 cca tcg ccg caa gtc aac aaa aac ccg aaa tgg ctt cga ctc atc cgt192 Pro Ser Pro Gln Val Asn Lys Asn Pro Lys Trp Leu Arg Leu Ile Arg 5055 60 aac tat cta cca gag aaa aga atc caa gtc ggc ttc ctt aac ata gac240 Asn Tyr Leu Pro Glu Lys Arg Ile Gln Val Gly Phe Leu Asn Ile Asp 6570 75 80 gag aaa gag cgt gag agc tac gag gct cgt gga ccg ttg gta ctt aag288 Glu Lys Glu Arg Glu Ser Tyr Glu Ala Arg Gly Pro Leu Val Leu Lys 8590 95 aac atc cac gtg ccg ctt gat cat ata ccc aag aat gtc act tgg aag336 Asn Ile His Val Pro Leu Asp His Ile Pro Lys Asn Val Thr Trp Lys 100105 110 agt ctt tac ccg gag tgg atc aac gag gaa gct tct acc tgt ccg gag384 Ser Leu Tyr Pro Glu Trp Ile Asn Glu Glu Ala Ser Thr Cys Pro Glu 115120 125 atc cct ctc cct cag cca gaa ggt tct gat gct aac gtg gac gtt att432 Ile Pro Leu Pro Gln Pro Glu Gly Ser Asp Ala Asn Val Asp Val Ile 130135 140 gtt gct aga gtt cca tgt gat ggt tgg tcg gcg aat aaa ggg ctt agg480 Val Ala Arg Val Pro Cys Asp Gly Trp Ser Ala Asn Lys Gly Leu Arg 145150 155 160 gac gtt ttt agg ctt cag gtt aat ttg gcc gca gcg aat cta gccgtc 528 Asp Val Phe Arg Leu Gln Val Asn Leu Ala Ala Ala Asn Leu Ala Val165 170 175 caa agt ggg ttg agg acg gtt aat cag gcg gtc tac gtt gta ttcatc 576 Gln Ser Gly Leu Arg Thr Val Asn Gln Ala Val Tyr Val Val Phe Ile180 185 190 ggc tca tgt ggg cct atg cat gag att ttc ccg tgc gat gag cgcgtg 624 Gly Ser Cys Gly Pro Met His Glu Ile Phe Pro Cys Asp Glu Arg Val195 200 205 atg cgc gtg gag gat tat tgg gtg tat aag cct tat ctc cca aggttg 672 Met Arg Val Glu Asp Tyr Trp Val Tyr Lys Pro Tyr Leu Pro Arg Leu210 215 220 aag cag aag ctt ctc atg cct gtt ggt tca tgt cag att gct ccttca 720 Lys Gln Lys Leu Leu Met Pro Val Gly Ser Cys Gln Ile Ala Pro Ser225 230 235 240 ttt gct caa ttt ggt caa gaa gca tgg aga cca aaa cat gaagat aat 768 Phe Ala Gln Phe Gly Gln 
Glu Ala Trp Arg Pro Lys His Glu AspAsn 245 250 255 ctt gca tca aag gca gtc aca gcc tta ccc cgt cgc tta cgggtt gcc 816 Leu Ala Ser Lys Ala Val Thr Ala Leu Pro Arg Arg Leu Arg ValAla 260 265 270 tac gtg aca gta cta cac tcg tca gaa gcc tat gtt tgt ggggca ata 864 Tyr Val Thr Val Leu His Ser Ser Glu Ala Tyr Val Cys Gly AlaIle 275 280 285 gct tta gcg caa agt ata aga caa tca gga tcg cat aag gacatg att 912 Ala Leu Ala Gln Ser Ile Arg Gln Ser Gly Ser His Lys Asp MetIle 290 295 300 ctc ctc cat gat cat acc ata acc aac aag tct ctt att ggtctc agc 960 Leu Leu His Asp His Thr Ile Thr Asn Lys Ser Leu Ile Gly LeuSer 305 310 315 320 gct gcg gga tgg aat ctc cgg cta atc gac agg atc cgcagt cct ttt 1008 Ala Ala Gly Trp Asn Leu Arg Leu Ile Asp Arg Ile Arg SerPro Phe 325 330 335 tcg caa aaa gac tct tat aat gag tgg aac tat agc aaatta cgt gtg 1056 Ser Gln Lys Asp Ser Tyr Asn Glu Trp Asn Tyr Ser Lys LeuArg Val 340 345 350 tgg caa gta act gac tac gat aaa ctt gtg ttc ata gacgca gat ttc 1104 Trp Gln Val Thr Asp Tyr Asp Lys Leu Val Phe Ile Asp AlaAsp Phe 355 360 365 atc atc ctc aag aaa ctt gat cat ctc ttc tac tat ccacaa ctc tca 1152 Ile Ile Leu Lys Lys Leu Asp His Leu Phe Tyr Tyr Pro GlnLeu Ser 370 375 380 gct tca ggc aac gac aaa gtg tta ttc aac tcc gga atcatg gtt ctc 1200 Ala Ser Gly Asn Asp Lys Val Leu Phe Asn Ser Gly Ile MetVal Leu 385 390 395 400 gag cca tcg gca tgt atg ttt aaa gat tta atg gagaaa tcg ttc aag 1248 Glu Pro Ser Ala Cys Met Phe Lys Asp Leu Met Glu LysSer Phe Lys 405 410 415 att gag tca tac aac gga gga gac caa gga ttc cttaat gag ata ttt 1296 Ile Glu Ser Tyr Asn Gly Gly Asp Gln Gly Phe Leu AsnGlu Ile Phe 420 425 430 gta tgg tgg cac agg tta tcg aaa cga gtg aac acaatg aag tac ttc 1344 Val Trp Trp His Arg Leu Ser Lys Arg Val Asn Thr MetLys Tyr Phe 435 440 445 gac gaa aaa aat cat cga aga cac gat ctt cct gagaat gta gaa ggt 1392 Asp Glu Lys Asn His Arg Arg His Asp Leu Pro Glu AsnVal Glu Gly 450 455 460 ctg cac tac ttg ggg ttg aaa cca tgg gta tgt tataga gac tat gat 1440 Leu His Tyr Leu 
Gly Leu Lys Pro Trp Val Cys Tyr ArgAsp Tyr Asp 465 470 475 480 tgc aat tgg gac att agc gaa cga cgc gtg tttgca agc gat tct gtg 1488 Cys Asn Trp Asp Ile Ser Glu Arg Arg Val Phe AlaSer Asp Ser Val 485 490 495 cac gaa aaa tgg tgg aaa gtg tat gac aaa atgtca gag cag ttg aaa 1536 His Glu Lys Trp Trp Lys Val Tyr Asp Lys Met SerGlu Gln Leu Lys 500 505 510 ggt tat tgt ggt ttg aat aag aat atg gag aagagg att gag aag tgg 1584 Gly Tyr Cys Gly Leu Asn Lys Asn Met Glu Lys ArgIle Glu Lys Trp 515 520 525 aga aga atc gct aag aac aat agt ttg cct gatagg cat tgg gag att 1632 Arg Arg Ile Ala Lys Asn Asn Ser Leu Pro Asp ArgHis Trp Glu Ile 530 535 540 gaa gtg aga gat cct agg aag acg aat ctt cttgtt cag tga 1674 Glu Val Arg Asp Pro Arg Lys Thr Asn Leu Leu Val Gln 545550 555 11 557 PRT Arabidopsis thaliana 11 Met Gly Thr Lys Thr His AsnSer Arg Gly Lys Ile Phe Met Ile Tyr 1 5 10 15 Leu Ile Leu Val Ser LeuSer Leu Leu Gly Leu Ile Leu Pro Phe Lys 20 25 30 Pro Leu Phe Arg Ile ThrSer Pro Ser Ser Thr Leu Arg Ile Asp Leu 35 40 45 Pro Ser Pro Gln Val AsnLys Asn Pro Lys Trp Leu Arg Leu Ile Arg 50 55 60 Asn Tyr Leu Pro Glu LysArg Ile Gln Val Gly Phe Leu Asn Ile Asp 65 70 75 80 Glu Lys Glu Arg GluSer Tyr Glu Ala Arg Gly Pro Leu Val Leu Lys 85 90 95 Asn Ile His Val ProLeu Asp His Ile Pro Lys Asn Val Thr Trp Lys 100 105 110 Ser Leu Tyr ProGlu Trp Ile Asn Glu Glu Ala Ser Thr Cys Pro Glu 115 120 125 Ile Pro LeuPro Gln Pro Glu Gly Ser Asp Ala Asn Val Asp Val Ile 130 135 140 Val AlaArg Val Pro Cys Asp Gly Trp Ser Ala Asn Lys Gly Leu Arg 145 150 155 160Asp Val Phe Arg Leu Gln Val Asn Leu Ala Ala Ala Asn Leu Ala Val 165 170175 Gln Ser Gly Leu Arg Thr Val Asn Gln Ala Val Tyr Val Val Phe Ile 180185 190 Gly Ser Cys Gly Pro Met His Glu Ile Phe Pro Cys Asp Glu Arg Val195 200 205 Met Arg Val Glu Asp Tyr Trp Val Tyr Lys Pro Tyr Leu Pro ArgLeu 210 215 220 Lys Gln Lys Leu Leu Met Pro Val Gly Ser Cys Gln Ile AlaPro Ser 225 230 235 240 Phe Ala Gln Phe Gly Gln Glu Ala Trp Arg Pro LysHis Glu Asp Asn 245 250 255 Leu Ala Ser 
Lys Ala Val Thr Ala Leu Pro Arg Arg Leu Arg Val Ala 260 265 270 Tyr Val Thr Val Leu His Ser Ser Glu Ala Tyr Val Cys Gly Ala Ile 275 280 285 Ala Leu Ala Gln Ser Ile Arg Gln Ser Gly Ser His Lys Asp Met Ile 290 295 300 Leu Leu His Asp His Thr Ile Thr Asn Lys Ser Leu Ile Gly Leu Ser 305 310 315 320 Ala Ala Gly Trp Asn Leu Arg Leu Ile Asp Arg Ile Arg Ser Pro Phe 325 330 335 Ser Gln Lys Asp Ser Tyr Asn Glu Trp Asn Tyr Ser Lys Leu Arg Val 340 345 350 Trp Gln Val Thr Asp Tyr Asp Lys Leu Val Phe Ile Asp Ala Asp Phe 355 360 365 Ile Ile Leu Lys Lys Leu Asp His Leu Phe Tyr Tyr Pro Gln Leu Ser 370 375 380 Ala Ser Gly Asn Asp Lys Val Leu Phe Asn Ser Gly Ile Met Val Leu 385 390 395 400 Glu Pro Ser Ala Cys Met Phe Lys Asp Leu Met Glu Lys Ser Phe Lys 405 410 415 Ile Glu Ser Tyr Asn Gly Gly Asp Gln Gly Phe Leu Asn Glu Ile Phe 420 425 430 Val Trp Trp His Arg Leu Ser Lys Arg Val Asn Thr Met Lys Tyr Phe 435 440 445 Asp Glu Lys Asn His Arg Arg His Asp Leu Pro Glu Asn Val Glu Gly 450 455 460 Leu His Tyr Leu Gly Leu Lys Pro Trp Val Cys Tyr Arg Asp Tyr Asp 465 470 475 480 Cys Asn Trp Asp Ile Ser Glu Arg Arg Val Phe Ala Ser Asp Ser Val 485 490 495 His Glu Lys Trp Trp Lys Val Tyr Asp Lys Met Ser Glu Gln Leu Lys 500 505 510 Gly Tyr Cys Gly Leu Asn Lys Asn Met Glu Lys Arg Ile Glu Lys Trp 515 520 525 Arg Arg Ile Ala Lys Asn Asn Ser Leu Pro Asp Arg His Trp Glu Ile 530 535 540 Glu Val Arg Asp Pro Arg Lys Thr Asn Leu Leu Val Gln 545 550 555 12 1002 DNA Arabidopsis thaliana CDS (1)..(1002) 12 atg gcc tta cta aat gaa tta atg agt ttt ttt atc caa aaa caa aaa 48 Met Ala Leu Leu Asn Glu Leu Met Ser Phe Phe Ile Gln Lys Gln Lys 1 5 10 15 gca ggt gta gac aaa gtg tat gac cta acg aag ata gaa gca gag aca 96 Ala Gly Val Asp Lys Val Tyr Asp Leu Thr Lys Ile Glu Ala Glu Thr 20 25 30 aaa cga cca aaa cgt gaa gcc tac gta act gtt ctt cac tct tcc gag 144 Lys Arg Pro Lys Arg Glu Ala Tyr Val Thr Val Leu His Ser Ser Glu 35 40 45 tct tat gtc tgt ggt gcc ata act ttg gct caa agc ctc ctt cag aca 192 Ser Tyr Val Cys Gly Ala Ile Thr Leu Ala Gln Ser Leu
Leu Gln Thr 5055 60 aac acc aaa cgc gat ctt atc ctt ctc cac gat gac tcc atc tcc att240 Asn Thr Lys Arg Asp Leu Ile Leu Leu His Asp Asp Ser Ile Ser Ile 6570 75 80 acc aaa ctt cga gct ctc gcc gcc gca gga tgg aag ctt cgt cgg atc288 Thr Lys Leu Arg Ala Leu Ala Ala Ala Gly Trp Lys Leu Arg Arg Ile 8590 95 att cga atc aga aac cca ctt gcg gag aag gac tcg tac aat gaa tac336 Ile Arg Ile Arg Asn Pro Leu Ala Glu Lys Asp Ser Tyr Asn Glu Tyr 100105 110 aac tac agc aag ttt cga ctc tgg caa ttg aca gat tac gac aaa gtg384 Asn Tyr Ser Lys Phe Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Val 115120 125 atc ttc att gat gcc gac atc atc gtc tta cgt aac ctt gat ctt ctc432 Ile Phe Ile Asp Ala Asp Ile Ile Val Leu Arg Asn Leu Asp Leu Leu 130135 140 ttc cat ttt cct cag atg tcg gcc acc gga aat gat gta tgg ata tat480 Phe His Phe Pro Gln Met Ser Ala Thr Gly Asn Asp Val Trp Ile Tyr 145150 155 160 aat tca ggc atc atg gtc atc gag cct tct aat tgt acg ttt actaca 528 Asn Ser Gly Ile Met Val Ile Glu Pro Ser Asn Cys Thr Phe Thr Thr165 170 175 atc atg agc cag cga agc gag atc gtt tca tac aac ggt gga gatcaa 576 Ile Met Ser Gln Arg Ser Glu Ile Val Ser Tyr Asn Gly Gly Asp Gln180 185 190 ggg tac cta aac gag ata ttt gtg tgg tgg cac cga ttg cct cgacga 624 Gly Tyr Leu Asn Glu Ile Phe Val Trp Trp His Arg Leu Pro Arg Arg195 200 205 gta aac ttt ctg aag aac ttc tgg tcg aac aca acc aaa gaa agaaac 672 Val Asn Phe Leu Lys Asn Phe Trp Ser Asn Thr Thr Lys Glu Arg Asn210 215 220 atc aag aac aac ctc ttc gcc gcg gag ccg cct cag gtc tac gcggtc 720 Ile Lys Asn Asn Leu Phe Ala Ala Glu Pro Pro Gln Val Tyr Ala Val225 230 235 240 cac tac tta ggt tgg aaa cca tgg ctt tgc tat agg gac tacgat tgc 768 His Tyr Leu Gly Trp Lys Pro Trp Leu Cys Tyr Arg Asp Tyr AspCys 245 250 255 aac tac gac gtg gac gag cag ttg gtg tac gct agt gat gcggct cac 816 Asn Tyr Asp Val Asp Glu Gln Leu Val Tyr Ala Ser Asp Ala AlaHis 260 265 270 gtt agg tgg tgg aaa gtg cac gac tcc atg gac gat gca ttgcaa aag 864 Val Arg Trp Trp Lys Val His Asp Ser Met Asp Asp Ala Leu 
GlnLys 275 280 285 ttt tgc agg ctg acg aaa aag agg aga acg gag atc aac tgggag agg 912 Phe Cys Arg Leu Thr Lys Lys Arg Arg Thr Glu Ile Asn Trp GluArg 290 295 300 agg aaa gca agg ctt aga ggt tcc act gat tat cat tgg aagatc aat 960 Arg Lys Ala Arg Leu Arg Gly Ser Thr Asp Tyr His Trp Lys IleAsn 305 310 315 320 gtc act gat cca aga cga cgt cgt tct tat ttg att ggttaa 1002 Val Thr Asp Pro Arg Arg Arg Arg Ser Tyr Leu Ile Gly 325 330 13333 PRT Arabidopsis thaliana 13 Met Ala Leu Leu Asn Glu Leu Met Ser PhePhe Ile Gln Lys Gln Lys 1 5 10 15 Ala Gly Val Asp Lys Val Tyr Asp LeuThr Lys Ile Glu Ala Glu Thr 20 25 30 Lys Arg Pro Lys Arg Glu Ala Tyr ValThr Val Leu His Ser Ser Glu 35 40 45 Ser Tyr Val Cys Gly Ala Ile Thr LeuAla Gln Ser Leu Leu Gln Thr 50 55 60 Asn Thr Lys Arg Asp Leu Ile Leu LeuHis Asp Asp Ser Ile Ser Ile 65 70 75 80 Thr Lys Leu Arg Ala Leu Ala AlaAla Gly Trp Lys Leu Arg Arg Ile 85 90 95 Ile Arg Ile Arg Asn Pro Leu AlaGlu Lys Asp Ser Tyr Asn Glu Tyr 100 105 110 Asn Tyr Ser Lys Phe Arg LeuTrp Gln Leu Thr Asp Tyr Asp Lys Val 115 120 125 Ile Phe Ile Asp Ala AspIle Ile Val Leu Arg Asn Leu Asp Leu Leu 130 135 140 Phe His Phe Pro GlnMet Ser Ala Thr Gly Asn Asp Val Trp Ile Tyr 145 150 155 160 Asn Ser GlyIle Met Val Ile Glu Pro Ser Asn Cys Thr Phe Thr Thr 165 170 175 Ile MetSer Gln Arg Ser Glu Ile Val Ser Tyr Asn Gly Gly Asp Gln 180 185 190 GlyTyr Leu Asn Glu Ile Phe Val Trp Trp His Arg Leu Pro Arg Arg 195 200 205Val Asn Phe Leu Lys Asn Phe Trp Ser Asn Thr Thr Lys Glu Arg Asn 210 215220 Ile Lys Asn Asn Leu Phe Ala Ala Glu Pro Pro Gln Val Tyr Ala Val 225230 235 240 His Tyr Leu Gly Trp Lys Pro Trp Leu Cys Tyr Arg Asp Tyr AspCys 245 250 255 Asn Tyr Asp Val Asp Glu Gln Leu Val Tyr Ala Ser Asp AlaAla His 260 265 270 Val Arg Trp Trp Lys Val His Asp Ser Met Asp Asp AlaLeu Gln Lys 275 280 285 Phe Cys Arg Leu Thr Lys Lys Arg Arg Thr Glu IleAsn Trp Glu Arg 290 295 300 Arg Lys Ala Arg Leu Arg Gly Ser Thr Asp TyrHis Trp Lys Ile Asn 305 310 315 320 Val Thr Asp Pro Arg Arg Arg Arg SerTyr 
Leu Ile Gly 325 330 14 834 DNA Arabidopsis thaliana CDS (1)..(834)14 atg gct cct tcc aaa tct gca ctg ata cgc ttt aat cta gtc ttg ttg 48Met Ala Pro Ser Lys Ser Ala Leu Ile Arg Phe Asn Leu Val Leu Leu 1 5 1015 gca gcg gag ctt cct ttg ttg gat gct ctt ttc gtg att gca ctc cca 96Ala Ala Glu Leu Pro Leu Leu Asp Ala Leu Phe Val Ile Ala Leu Pro 20 25 30aga cta ata gat atc ttt ata ctg cta tgt gat cag gtg gtg aga gga 144 ArgLeu Ile Asp Ile Phe Ile Leu Leu Cys Asp Gln Val Val Arg Gly 35 40 45 gtgaag atg caa gaa ctc gtt gaa gag aac gaa ata aac aag aaa gat 192 Val LysMet Gln Glu Leu Val Glu Glu Asn Glu Ile Asn Lys Lys Asp 50 55 60 ttg ctaacc gct agt aac cag aca aag ctg gag gcg cca agc ttc atg 240 Leu Leu ThrAla Ser Asn Gln Thr Lys Leu Glu Ala Pro Ser Phe Met 65 70 75 80 gaa gagatt tta aca aga ggg tta gga aaa aca aag ata ggg atg gtg 288 Glu Glu IleLeu Thr Arg Gly Leu Gly Lys Thr Lys Ile Gly Met Val 85 90 95 aac atg gaagaa tgt gat ctt act aat tgg aaa cgt tat ggc gaa acg 336 Asn Met Glu GluCys Asp Leu Thr Asn Trp Lys Arg Tyr Gly Glu Thr 100 105 110 gtt cac atacat ttt gag cgt gtc tcg aag ctc ttc aaa tgg caa gac 384 Val His Ile HisPhe Glu Arg Val Ser Lys Leu Phe Lys Trp Gln Asp 115 120 125 ttg ttc cccgag tgg ata gat gaa gag gaa gaa acc gag gtt ccc aca 432 Leu Phe Pro GluTrp Ile Asp Glu Glu Glu Glu Thr Glu Val Pro Thr 130 135 140 tgt cct gagata cct atg ccc gat ttc gaa agc tta gag aag ttg gat 480 Cys Pro Glu IlePro Met Pro Asp Phe Glu Ser Leu Glu Lys Leu Asp 145 150 155 160 ttg gtagta gtg aag ttg cct tgt aat tac cct gaa gaa ggg tgg aga 528 Leu Val ValVal Lys Leu Pro Cys Asn Tyr Pro Glu Glu Gly Trp Arg 165 170 175 aga gaggtt ttg agg ttg caa gtg aac cta gtt gcg gct aac ttg gca 576 Arg Glu ValLeu Arg Leu Gln Val Asn Leu Val Ala Ala Asn Leu Ala 180 185 190 gcc aagaaa ggg aag acg gat tgg aga tgg aaa agc aaa gtg ttg ttt 624 Ala Lys LysGly Lys Thr Asp Trp Arg Trp Lys Ser Lys Val Leu Phe 195 200 205 tgg agcaaa tgt caa ccg atg att gag att ttc cgg tgt gat gat ttg 672 Trp Ser LysCys Gln Pro Met 
Ile Glu Ile Phe Arg Cys Asp Asp Leu 210 215 220 gag aagaga gag gca gat tgg tgg ctg tat cgc cct gag gtg gtt agg 720 Glu Lys ArgGlu Ala Asp Trp Trp Leu Tyr Arg Pro Glu Val Val Arg 225 230 235 240 ttacaa cag aga ctc agt ttg cca gtc gga tct tgc aat ctt gct ctt 768 Leu GlnGln Arg Leu Ser Leu Pro Val Gly Ser Cys Asn Leu Ala Leu 245 250 255 cctttg tgg gca cca caa ggt aaa att act ttc atg caa att aat ctt 816 Pro LeuTrp Ala Pro Gln Gly Lys Ile Thr Phe Met Gln Ile Asn Leu 260 265 270 cttgct aaa tat ttt tag 834 Leu Ala Lys Tyr Phe 275 15 277 PRT Arabidopsisthaliana 15 Met Ala Pro Ser Lys Ser Ala Leu Ile Arg Phe Asn Leu Val LeuLeu 1 5 10 15 Ala Ala Glu Leu Pro Leu Leu Asp Ala Leu Phe Val Ile AlaLeu Pro 20 25 30 Arg Leu Ile Asp Ile Phe Ile Leu Leu Cys Asp Gln Val ValArg Gly 35 40 45 Val Lys Met Gln Glu Leu Val Glu Glu Asn Glu Ile Asn LysLys Asp 50 55 60 Leu Leu Thr Ala Ser Asn Gln Thr Lys Leu Glu Ala Pro SerPhe Met 65 70 75 80 Glu Glu Ile Leu Thr Arg Gly Leu Gly Lys Thr Lys IleGly Met Val 85 90 95 Asn Met Glu Glu Cys Asp Leu Thr Asn Trp Lys Arg TyrGly Glu Thr 100 105 110 Val His Ile His Phe Glu Arg Val Ser Lys Leu PheLys Trp Gln Asp 115 120 125 Leu Phe Pro Glu Trp Ile Asp Glu Glu Glu GluThr Glu Val Pro Thr 130 135 140 Cys Pro Glu Ile Pro Met Pro Asp Phe GluSer Leu Glu Lys Leu Asp 145 150 155 160 Leu Val Val Val Lys Leu Pro CysAsn Tyr Pro Glu Glu Gly Trp Arg 165 170 175 Arg Glu Val Leu Arg Leu GlnVal Asn Leu Val Ala Ala Asn Leu Ala 180 185 190 Ala Lys Lys Gly Lys ThrAsp Trp Arg Trp Lys Ser Lys Val Leu Phe 195 200 205 Trp Ser Lys Cys GlnPro Met Ile Glu Ile Phe Arg Cys Asp Asp Leu 210 215 220 Glu Lys Arg GluAla Asp Trp Trp Leu Tyr Arg Pro Glu Val Val Arg 225 230 235 240 Leu GlnGln Arg Leu Ser Leu Pro Val Gly Ser Cys Asn Leu Ala Leu 245 250 255 ProLeu Trp Ala Pro Gln Gly Lys Ile Thr Phe Met Gln Ile Asn Leu 260 265 270Leu Ala Lys Tyr Phe 275 16 383 DNA Hordeum vulgare CDS (46)..(381) 16ttgaatctgc gggttggaag gtcagaataa ttgagaggat cggaa ccc gaa gcc gag 57 ProGlu Ala Glu 1 cgt gat gct 
tac aat gag tgg aac tac agc aag ttc cgg ttgtgg cag 105 Arg Asp Ala Tyr Asn Glu Trp Asn Tyr Ser Lys Phe Arg Leu TrpGln 5 10 15 20 ctc acg gac tat gac aag atc ata ttc ata gat gct gat ctgctc atc 153 Leu Thr Asp Tyr Asp Lys Ile Ile Phe Ile Asp Ala Asp Leu LeuIle 25 30 35 ttg agg aac att gat ttc ctg ttt aca atg cca gaa atc agt gcaacc 201 Leu Arg Asn Ile Asp Phe Leu Phe Thr Met Pro Glu Ile Ser Ala Thr40 45 50 ggc aac aat gca aca ctc ttc aac tct ggt gtc atg gtc atc gaa ccc249 Gly Asn Asn Ala Thr Leu Phe Asn Ser Gly Val Met Val Ile Glu Pro 5560 65 tca aac tgc aca ttc cag ctg tta atg gag cac atc aat gag ata aca297 Ser Asn Cys Thr Phe Gln Leu Leu Met Glu His Ile Asn Glu Ile Thr 7075 80 tct tac aat ggt ggt gat cag ggc tac ttg aat gag ata ttc aca tgg345 Ser Tyr Asn Gly Gly Asp Gln Gly Tyr Leu Asn Glu Ile Phe Thr Trp 8590 95 100 tgg cat cgg att ccc aag cac atg aac ttc ctg aag ca 383 Trp HisArg Ile Pro Lys His Met Asn Phe Leu Lys 105 110 17 112 PRT Hordeumvulgare 17 Pro Glu Ala Glu Arg Asp Ala Tyr Asn Glu Trp Asn Tyr Ser LysPhe 1 5 10 15 Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Ile Ile Phe IleAsp Ala 20 25 30 Asp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu Phe Thr MetPro Glu 35 40 45 Ile Ser Ala Thr Gly Asn Asn Ala Thr Leu Phe Asn Ser GlyVal Met 50 55 60 Val Ile Glu Pro Ser Asn Cys Thr Phe Gln Leu Leu Met GluHis Ile 65 70 75 80 Asn Glu Ile Thr Ser Tyr Asn Gly Gly Asp Gln Gly TyrLeu Asn Glu 85 90 95 Ile Phe Thr Trp Trp His Arg Ile Pro Lys His Met AsnPhe Leu Lys 100 105 110 18 245 DNA Hordeum vulgare CDS (52)..(243) 18cgagcttgaa tctgcgggtt ggcaagtcag aataattgag aggatccgga a ccc gaa 57 ProGlu 1 gcc gag cgt gat gct tac aat gag tgg aac tac agc aag ttc cgg ttg105 Ala Glu Arg Asp Ala Tyr Asn Glu Trp Asn Tyr Ser Lys Phe Arg Leu 5 1015 tgg cag ctc acg gac tat gac aag atc ata ttc ata gat gct gat ctg 153Trp Gln Leu Thr Asp Tyr Asp Lys Ile Ile Phe Ile Asp Ala Asp Leu 20 25 30ctc atc ttg agg aac att gat ttc ctg ttt aca atg cca gaa atc agt 201 LeuIle Leu Arg Asn Ile Asp Phe Leu Phe Thr Met Pro Glu 
Ile Ser 35 40 45 50gca aac ggc aac aat gca aca ctc ttc aac tct ggt gtc atg gt 245 Ala AsnGly Asn Asn Ala Thr Leu Phe Asn Ser Gly Val Met 55 60 19 64 PRT Hordeumvulgare 19 Pro Glu Ala Glu Arg Asp Ala Tyr Asn Glu Trp Asn Tyr Ser LysPhe 1 5 10 15 Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Ile Ile Phe IleAsp Ala 20 25 30 Asp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu Phe Thr MetPro Glu 35 40 45 Ile Ser Ala Asn Gly Asn Asn Ala Thr Leu Phe Asn Ser GlyVal Met 50 55 60 20 1284 DNA Triticum aestivum CDS (1)..(1284) Variant272 Xaa = Ochre 20 acg cgt ccg ctc gcc ttc ttc ttc ctc gtt cta cat ggccct cct gct 48 Thr Arg Pro Leu Ala Phe Phe Phe Leu Val Leu His Gly ProPro Ala 1 5 10 15 cca ccc caa gta ctc cca cat cct cga ccg cgg cgc ctcctc tct ggt 96 Pro Pro Gln Val Leu Pro His Pro Arg Pro Arg Arg Leu LeuSer Gly 20 25 30 ccg ctg cac ctt ccg cga cgc ctg ccc gtc cac gtc cca cctctc acg 144 Pro Leu His Leu Pro Arg Arg Leu Pro Val His Val Pro Pro LeuThr 35 40 45 gaa ggt aag ccg gga gga aga tca gtg gcg gcg gcg aac aag gtggtg 192 Glu Gly Lys Pro Gly Gly Arg Ser Val Ala Ala Ala Asn Lys Val Val50 55 60 gcg acg gag cgg atc gtg aac gcg ggg cgc gcg ccg acc atg ttc aac240 Ala Thr Glu Arg Ile Val Asn Ala Gly Arg Ala Pro Thr Met Phe Asn 6570 75 80 gag ctg cgc ggc cgg ctg cgg atg ggc ctg gtg aac atc ggc cgc gac288 Glu Leu Arg Gly Arg Leu Arg Met Gly Leu Val Asn Ile Gly Arg Asp 8590 95 gag ctg ctg gcg ctg ggc gtg gag gga gac gcc gtg ggc gtg gac ttc336 Glu Leu Leu Ala Leu Gly Val Glu Gly Asp Ala Val Gly Val Asp Phe 100105 110 gac cgc gtg tcg gac gtg ttc cgg tgg tca gac ctg ttc ccg gag tgg384 Asp Arg Val Ser Asp Val Phe Arg Trp Ser Asp Leu Phe Pro Glu Trp 115120 125 atc gac gag gag gag gag gac ggc gtc ccc tcc tgc ccg gag atc ccc432 Ile Asp Glu Glu Glu Glu Asp Gly Val Pro Ser Cys Pro Glu Ile Pro 130135 140 atg ccg gac ttc tcc cgg tac gac gac gac ggc gtg gac gtg gtg gtg480 Met Pro Asp Phe Ser Arg Tyr Asp Asp Asp Gly Val Asp Val Val Val 145150 155 160 gcg gcg ctg ccg tgc aac cgg acg gcg gtc cgg ggg tgg aac 
cgcgac 528 Ala Ala Leu Pro Cys Asn Arg Thr Ala Val Arg Gly Trp Asn Arg Asp165 170 175 gtg ttc agg ctg cag gtg cac ctg gtg gcg gcg cac atg gcg gcgcgg 576 Val Phe Arg Leu Gln Val His Leu Val Ala Ala His Met Ala Ala Arg180 185 190 aag tgg gcg gcg cga cgg cgc cgg ccg ggt gcg cgt ggt gct gcggag 624 Lys Trp Ala Ala Arg Arg Arg Arg Pro Gly Ala Arg Gly Ala Ala Glu195 200 205 cga gtg cga gcc gat gat gga cct gtt ccg gtg cga cga gtc cgtggg 672 Arg Val Arg Ala Asp Asp Gly Pro Val Pro Val Arg Arg Val Arg Gly210 215 220 gcg gga ggg gga ctg gtg gat gta cag cgt cga cgc gcc gcg catgga 720 Ala Gly Gly Gly Leu Val Asp Val Gln Arg Arg Arg Ala Ala His Gly225 230 235 240 gga gaa gct ccg gct gcc cat cgg ctc ctg caa cct cgc cgctgc cgc 768 Gly Glu Ala Pro Ala Ala His Arg Leu Leu Gln Pro Arg Arg CysArg 245 250 255 tct ggg ggc caa cag gca tcc acg agg tgt tca acg cgt cagacc taa 816 Ser Gly Gly Gln Gln Ala Ser Thr Arg Cys Ser Thr Arg Gln ThrXaa 260 265 270 cag cgg tgg acg ccg gca gcc agc ggc gcg agg cgt acg cgactg gtg 864 Gln Arg Trp Thr Pro Ala Ala Ser Gly Ala Arg Arg Thr Arg LeuVal 275 280 285 ctg cac tcg tcc gac cga tac ctg tgc ggc gcc atc gtg ctggcg cag 912 Leu His Ser Ser Asp Arg Tyr Leu Cys Gly Ala Ile Val Leu AlaGln 290 295 300 agc atc cgg cgg tcg ggc tcc acc cgc gac atg gtc ctc ctccac gac 960 Ser Ile Arg Arg Ser Gly Ser Thr Arg Asp Met Val Leu Leu HisAsp 305 310 315 320 cac acc gtc tcc aag ccg gcc ctc cgc gcg ctg gtc gccgcc ggc tgg 1008 His Thr Val Ser Lys Pro Ala Leu Arg Ala Leu Val Ala AlaGly Trp 325 330 335 atc ccg cgc agg atc cgg cgc atc cgc aac ccg cgc gcggag cgg ggc 1056 Ile Pro Arg Arg Ile Arg Arg Ile Arg Asn Pro Arg Ala GluArg Gly 340 345 350 tcc tac aac gag tac aac tac agc aag ttc cgg ctg tggcag ctg acg 1104 Ser Tyr Asn Glu Tyr Asn Tyr Ser Lys Phe Arg Leu Trp GlnLeu Thr 355 360 365 gag tac ttc cgc gtc gtc ttc atc gac gcc gac atc ctcgtc ctc cgc 1152 Glu Tyr Phe Arg Val Val Phe Ile Asp Ala Asp Ile Leu ValLeu Arg 370 375 380 tcc ctc gac gcg ctc ttc cgc ttc ccg cag atc tcc gccggg 
ggc aac 1200 Ser Leu Asp Ala Leu Phe Arg Phe Pro Gln Ile Ser Ala GlyGly Asn 385 390 395 400 gac ggc tcc ctc ttc aac tcg ggg aac atg gtg ctcgag ccg tcg gcg 1248 Asp Gly Ser Leu Phe Asn Ser Gly Asn Met Val Leu GluPro Ser Ala 405 410 415 tgc acc ttc gag gcg ctc gtc cgg ggg cgg cgc aca1284 Cys Thr Phe Glu Ala Leu Val Arg Gly Arg Arg Thr 420 425 21 271 PRTTriticum aestivum 21 Thr Arg Pro Leu Ala Phe Phe Phe Leu Val Leu His GlyPro Pro Ala 1 5 10 15 Pro Pro Gln Val Leu Pro His Pro Arg Pro Arg ArgLeu Leu Ser Gly 20 25 30 Pro Leu His Leu Pro Arg Arg Leu Pro Val His ValPro Pro Leu Thr 35 40 45 Glu Gly Lys Pro Gly Gly Arg Ser Val Ala Ala AlaAsn Lys Val Val 50 55 60 Ala Thr Glu Arg Ile Val Asn Ala Gly Arg Ala ProThr Met Phe Asn 65 70 75 80 Glu Leu Arg Gly Arg Leu Arg Met Gly Leu ValAsn Ile Gly Arg Asp 85 90 95 Glu Leu Leu Ala Leu Gly Val Glu Gly Asp AlaVal Gly Val Asp Phe 100 105 110 Asp Arg Val Ser Asp Val Phe Arg Trp SerAsp Leu Phe Pro Glu Trp 115 120 125 Ile Asp Glu Glu Glu Glu Asp Gly ValPro Ser Cys Pro Glu Ile Pro 130 135 140 Met Pro Asp Phe Ser Arg Tyr AspAsp Asp Gly Val Asp Val Val Val 145 150 155 160 Ala Ala Leu Pro Cys AsnArg Thr Ala Val Arg Gly Trp Asn Arg Asp 165 170 175 Val Phe Arg Leu GlnVal His Leu Val Ala Ala His Met Ala Ala Arg 180 185 190 Lys Trp Ala AlaArg Arg Arg Arg Pro Gly Ala Arg Gly Ala Ala Glu 195 200 205 Arg Val ArgAla Asp Asp Gly Pro Val Pro Val Arg Arg Val Arg Gly 210 215 220 Ala GlyGly Gly Leu Val Asp Val Gln Arg Arg Arg Ala Ala His Gly 225 230 235 240Gly Glu Ala Pro Ala Ala His Arg Leu Leu Gln Pro Arg Arg Cys Arg 245 250255 Ser Gly Gly Gln Gln Ala Ser Thr Arg Cys Ser Thr Arg Gln Thr 260 265270 22 156 PRT Triticum aestivum 22 Gln Arg Trp Thr Pro Ala Ala Ser GlyAla Arg Arg Thr Arg Leu Val 1 5 10 15 Leu His Ser Ser Asp Arg Tyr LeuCys Gly Ala Ile Val Leu Ala Gln 20 25 30 Ser Ile Arg Arg Ser Gly Ser ThrArg Asp Met Val Leu Leu His Asp 35 40 45 His Thr Val Ser Lys Pro Ala LeuArg Ala Leu Val Ala Ala Gly Trp 50 55 60 Ile Pro Arg Arg Ile Arg Arg IleArg Asn 
Pro Arg Ala Glu Arg Gly 65 70 75 80 Ser Tyr Asn Glu Tyr Asn TyrSer Lys Phe Arg Leu Trp Gln Leu Thr 85 90 95 Glu Tyr Phe Arg Val Val PheIle Asp Ala Asp Ile Leu Val Leu Arg 100 105 110 Ser Leu Asp Ala Leu PheArg Phe Pro Gln Ile Ser Ala Gly Gly Asn 115 120 125 Asp Gly Ser Leu PheAsn Ser Gly Asn Met Val Leu Glu Pro Ser Ala 130 135 140 Cys Thr Phe GluAla Leu Val Arg Gly Arg Arg Thr 145 150 155 23 2028 DNA Arabidopsisthaliana CDS (1)..(1854) 23 atg ata cct tcc tca agt ccc atg gag tca agacat cga ctc tcg ttc 48 Met Ile Pro Ser Ser Ser Pro Met Glu Ser Arg HisArg Leu Ser Phe 1 5 10 15 tca aat gag aag aca agt agg agg aga ttt caaaga att gag aag ggt 96 Ser Asn Glu Lys Thr Ser Arg Arg Arg Phe Gln ArgIle Glu Lys Gly 20 25 30 gtc aag ttc aac act ctg aaa ctt gtg ttg att tgtata atg ctt gga 144 Val Lys Phe Asn Thr Leu Lys Leu Val Leu Ile Cys IleMet Leu Gly 35 40 45 gct ttg ttc acg atc tac cgt ttt cgt tat cca ccg ctacaa att cct 192 Ala Leu Phe Thr Ile Tyr Arg Phe Arg Tyr Pro Pro Leu GlnIle Pro 50 55 60 gaa att cca act agt ttt ggt ctt act act gat cct cgc tatgta gct 240 Glu Ile Pro Thr Ser Phe Gly Leu Thr Thr Asp Pro Arg Tyr ValAla 65 70 75 80 aca gct gag atc aac tgg aac cat atg tca aat ctt gtt gagaag cac 288 Thr Ala Glu Ile Asn Trp Asn His Met Ser Asn Leu Val Glu LysHis 85 90 95 gta ttt ggt aga agc gag tat caa gga att ggt ctt ata aat cttaac 336 Val Phe Gly Arg Ser Glu Tyr Gln Gly Ile Gly Leu Ile Asn Leu Asn100 105 110 gat aac gag att gat cga ttc aag gag gta acg aaa tct gac tgtgat 384 Asp Asn Glu Ile Asp Arg Phe Lys Glu Val Thr Lys Ser Asp Cys Asp115 120 125 cat gta gct ttg cat cta gat tat gct gca aag aac ata aca tgggaa 432 His Val Ala Leu His Leu Asp Tyr Ala Ala Lys Asn Ile Thr Trp Glu130 135 140 tct tta tac ccg gaa tgg att gat gaa gtt gaa gaa ttc gaa gtccct 480 Ser Leu Tyr Pro Glu Trp Ile Asp Glu Val Glu Glu Phe Glu Val Pro145 150 155 160 act tgt cct tct ctg cct ttg att caa att cct ggc aag cctcgg att 528 Thr Cys Pro Ser Leu Pro Leu Ile Gln Ile Pro Gly Lys Pro ArgIle 165 170 175 gat 
ctt gta att gcc aag ctt ccg tgt gat aaa tca gga aaatgg tct 576 Asp Leu Val Ile Ala Lys Leu Pro Cys Asp Lys Ser Gly Lys TrpSer 180 185 190 aga gat gtg gct cgc ttg cat tta caa ctt gca gca gct cgagtg gcg 624 Arg Asp Val Ala Arg Leu His Leu Gln Leu Ala Ala Ala Arg ValAla 195 200 205 gct tct tct aaa gga ctt cat aat gtt cat gtg att ttg gtatct gat 672 Ala Ser Ser Lys Gly Leu His Asn Val His Val Ile Leu Val SerAsp 210 215 220 tgc ttt cca ata ccg aat ctt ttt acg ggt caa gaa ctt gttgcc cgt 720 Cys Phe Pro Ile Pro Asn Leu Phe Thr Gly Gln Glu Leu Val AlaArg 225 230 235 240 caa gga aac ata tgg ctg tat aag cct aat ctt cac cagcta aga caa 768 Gln Gly Asn Ile Trp Leu Tyr Lys Pro Asn Leu His Gln LeuArg Gln 245 250 255 aag tta cag ctt cct gtt ggt tcc tgt gaa ctt tct gttcct ctt caa 816 Lys Leu Gln Leu Pro Val Gly Ser Cys Glu Leu Ser Val ProLeu Gln 260 265 270 gct aaa gat aat ttc tac tcc gca ggt gca aag aaa gaagct tac gcg 864 Ala Lys Asp Asn Phe Tyr Ser Ala Gly Ala Lys Lys Glu AlaTyr Ala 275 280 285 act atc ttg cat tct gcc caa ttt tat gtc tgt gga gccatt gca gct 912 Thr Ile Leu His Ser Ala Gln Phe Tyr Val Cys Gly Ala IleAla Ala 290 295 300 gca cag agc att cga atg tca ggc tct act cgt gat ctggtc ata ctt 960 Ala Gln Ser Ile Arg Met Ser Gly Ser Thr Arg Asp Leu ValIle Leu 305 310 315 320 gtt gat gaa acg ata agc gaa tac cat aaa agt ggcttg gta gct gct 1008 Val Asp Glu Thr Ile Ser Glu Tyr His Lys Ser Gly LeuVal Ala Ala 325 330 335 gga tgg aag att caa atg ttt caa aga atc agg aacccg aat gct gta 1056 Gly Trp Lys Ile Gln Met Phe Gln Arg Ile Arg Asn ProAsn Ala Val 340 345 350 cca aat gcc tac aac gaa tgg aac tac agc aag tttcgt ctt tgg caa 1104 Pro Asn Ala Tyr Asn Glu Trp Asn Tyr Ser Lys Phe ArgLeu Trp Gln 355 360 365 ctg act gaa tac agt aag atc atc ttc atc gat gcagac atg ctt atc 1152 Leu Thr Glu Tyr Ser Lys Ile Ile Phe Ile Asp Ala AspMet Leu Ile 370 375 380 ctg aga aac att gat ttc ctc ttc gag ttc cct gagata tca gca act 1200 Leu Arg Asn Ile Asp Phe Leu Phe Glu Phe Pro Glu IleSer Ala Thr 385 390 395 
400 gga aac aat gct acg ctc ttc aac tct ggt ctaatg gtg gtt gag cca 1248 Gly Asn Asn Ala Thr Leu Phe Asn Ser Gly Leu MetVal Val Glu Pro 405 410 415 tct aat tca aca ttc cag tta cta atg gat aacatt aat gaa gtt gtg 1296 Ser Asn Ser Thr Phe Gln Leu Leu Met Asp Asn IleAsn Glu Val Val 420 425 430 tct tac aac gga gga gac caa ggt tac ctt aacgag ata ttc aca tgg 1344 Ser Tyr Asn Gly Gly Asp Gln Gly Tyr Leu Asn GluIle Phe Thr Trp 435 440 445 tgg cat cgg att cca aaa cac atg aat ttc ttgaag cat ttc tgg gaa 1392 Trp His Arg Ile Pro Lys His Met Asn Phe Leu LysHis Phe Trp Glu 450 455 460 gga gac gaa cct gag att aaa aaa atg aag acgagt cta ttt gga gct 1440 Gly Asp Glu Pro Glu Ile Lys Lys Met Lys Thr SerLeu Phe Gly Ala 465 470 475 480 gat cct ccg atc cta tac gtt ctt cat taccta ggt tat aac aaa ccc 1488 Asp Pro Pro Ile Leu Tyr Val Leu His Tyr LeuGly Tyr Asn Lys Pro 485 490 495 tgg tta tgc ttc aga gac tat gac tgc aattgg aat gtc gat att ttc 1536 Trp Leu Cys Phe Arg Asp Tyr Asp Cys Asn TrpAsn Val Asp Ile Phe 500 505 510 cag gaa ttt gct agt gac gag gct cat aaaacc tgg tgg aga gtg cac 1584 Gln Glu Phe Ala Ser Asp Glu Ala His Lys ThrTrp Trp Arg Val His 515 520 525 gac gca atg cct gaa aac ttg cat aag ttctgt cta cta aga tcg aaa 1632 Asp Ala Met Pro Glu Asn Leu His Lys Phe CysLeu Leu Arg Ser Lys 530 535 540 cag aag gcg caa ctt gaa tgg gat agg agacaa gca gag aaa ggg aac 1680 Gln Lys Ala Gln Leu Glu Trp Asp Arg Arg GlnAla Glu Lys Gly Asn 545 550 555 560 tac aaa gat gga cat tgg aag ata aagatc aaa gac aag aga ctt aag 1728 Tyr Lys Asp Gly His Trp Lys Ile Lys IleLys Asp Lys Arg Leu Lys 565 570 575 act tgt ttc gaa gat ttc tgc ttt tgggag agt atg ctt tgg cat tgg 1776 Thr Cys Phe Glu Asp Phe Cys Phe Trp GluSer Met Leu Trp His Trp 580 585 590 ggt gag acg aac tct acc aac aat tcttcc acc acc acc act tca tca 1824 Gly Glu Thr Asn Ser Thr Asn Asn Ser SerThr Thr Thr Thr Ser Ser 595 600 605 ccg ccg cat aaa acc gct ctc cct tccctg tgaattcttt tggctttctg 1874 Pro Pro His Lys Thr Ala Leu Pro Ser Leu610 615 gtttggtaca 
aattactctg cctttcgcca accaaatgtg ggttggatatgttcttttgt 1934 ttttttatta tcagcttgaa acctgtatac gaatcccaga aacaatgtaatcatgagggg 1994 ataaaggaat gaaagacaaa taaagaattt acag 2028 24 618 PRTArabidopsis thaliana 24 Met Ile Pro Ser Ser Ser Pro Met Glu Ser Arg HisArg Leu Ser Phe 1 5 10 15 Ser Asn Glu Lys Thr Ser Arg Arg Arg Phe GlnArg Ile Glu Lys Gly 20 25 30 Val Lys Phe Asn Thr Leu Lys Leu Val Leu IleCys Ile Met Leu Gly 35 40 45 Ala Leu Phe Thr Ile Tyr Arg Phe Arg Tyr ProPro Leu Gln Ile Pro 50 55 60 Glu Ile Pro Thr Ser Phe Gly Leu Thr Thr AspPro Arg Tyr Val Ala 65 70 75 80 Thr Ala Glu Ile Asn Trp Asn His Met SerAsn Leu Val Glu Lys His 85 90 95 Val Phe Gly Arg Ser Glu Tyr Gln Gly IleGly Leu Ile Asn Leu Asn 100 105 110 Asp Asn Glu Ile Asp Arg Phe Lys GluVal Thr Lys Ser Asp Cys Asp 115 120 125 His Val Ala Leu His Leu Asp TyrAla Ala Lys Asn Ile Thr Trp Glu 130 135 140 Ser Leu Tyr Pro Glu Trp IleAsp Glu Val Glu Glu Phe Glu Val Pro 145 150 155 160 Thr Cys Pro Ser LeuPro Leu Ile Gln Ile Pro Gly Lys Pro Arg Ile 165 170 175 Asp Leu Val IleAla Lys Leu Pro Cys Asp Lys Ser Gly Lys Trp Ser 180 185 190 Arg Asp ValAla Arg Leu His Leu Gln Leu Ala Ala Ala Arg Val Ala 195 200 205 Ala SerSer Lys Gly Leu His Asn Val His Val Ile Leu Val Ser Asp 210 215 220 CysPhe Pro Ile Pro Asn Leu Phe Thr Gly Gln Glu Leu Val Ala Arg 225 230 235240 Gln Gly Asn Ile Trp Leu Tyr Lys Pro Asn Leu His Gln Leu Arg Gln 245250 255 Lys Leu Gln Leu Pro Val Gly Ser Cys Glu Leu Ser Val Pro Leu Gln260 265 270 Ala Lys Asp Asn Phe Tyr Ser Ala Gly Ala Lys Lys Glu Ala TyrAla 275 280 285 Thr Ile Leu His Ser Ala Gln Phe Tyr Val Cys Gly Ala IleAla Ala 290 295 300 Ala Gln Ser Ile Arg Met Ser Gly Ser Thr Arg Asp LeuVal Ile Leu 305 310 315 320 Val Asp Glu Thr Ile Ser Glu Tyr His Lys SerGly Leu Val Ala Ala 325 330 335 Gly Trp Lys Ile Gln Met Phe Gln Arg IleArg Asn Pro Asn Ala Val 340 345 350 Pro Asn Ala Tyr Asn Glu Trp Asn TyrSer Lys Phe Arg Leu Trp Gln 355 360 365 Leu Thr Glu Tyr Ser Lys Ile IlePhe Ile Asp Ala Asp Met Leu Ile 370 375 
380 Leu Arg Asn Ile Asp Phe LeuPhe Glu Phe Pro Glu Ile Ser Ala Thr 385 390 395 400 Gly Asn Asn Ala ThrLeu Phe Asn Ser Gly Leu Met Val Val Glu Pro 405 410 415 Ser Asn Ser ThrPhe Gln Leu Leu Met Asp Asn Ile Asn Glu Val Val 420 425 430 Ser Tyr AsnGly Gly Asp Gln Gly Tyr Leu Asn Glu Ile Phe Thr Trp 435 440 445 Trp HisArg Ile Pro Lys His Met Asn Phe Leu Lys His Phe Trp Glu 450 455 460 GlyAsp Glu Pro Glu Ile Lys Lys Met Lys Thr Ser Leu Phe Gly Ala 465 470 475480 Asp Pro Pro Ile Leu Tyr Val Leu His Tyr Leu Gly Tyr Asn Lys Pro 485490 495 Trp Leu Cys Phe Arg Asp Tyr Asp Cys Asn Trp Asn Val Asp Ile Phe500 505 510 Gln Glu Phe Ala Ser Asp Glu Ala His Lys Thr Trp Trp Arg ValHis 515 520 525 Asp Ala Met Pro Glu Asn Leu His Lys Phe Cys Leu Leu ArgSer Lys 530 535 540 Gln Lys Ala Gln Leu Glu Trp Asp Arg Arg Gln Ala GluLys Gly Asn 545 550 555 560 Tyr Lys Asp Gly His Trp Lys Ile Lys Ile LysAsp Lys Arg Leu Lys 565 570 575 Thr Cys Phe Glu Asp Phe Cys Phe Trp GluSer Met Leu Trp His Trp 580 585 590 Gly Glu Thr Asn Ser Thr Asn Asn SerSer Thr Thr Thr Thr Ser Ser 595 600 605 Pro Pro His Lys Thr Ala Leu ProSer Leu 610 615 25 1845 DNA Oryza sativa CDS (1)..(1845) 25 atg ggg gtgacg ggc ggc gcc ggg gag gcc gtc aag ccg tcg tcg tcg 48 Met Gly Val ThrGly Gly Ala Gly Glu Ala Val Lys Pro Ser Ser Ser 1 5 10 15 tcg tcg ttgtcg ccg gtg gcg ggg ctg agg gcg gcg gcc atc gtg aag 96 Ser Ser Leu SerPro Val Ala Gly Leu Arg Ala Ala Ala Ile Val Lys 20 25 30 ctg aac gcg gcgttc ctc gcc ttc ttc ttc ctc gcg tac atg gcg ctc 144 Leu Asn Ala Ala PheLeu Ala Phe Phe Phe Leu Ala Tyr Met Ala Leu 35 40 45 ctc ctc cac ccc aagtac tcc tac ctc ctc gac cgc ggc gcc gcc tcc 192 Leu Leu His Pro Lys TyrSer Tyr Leu Leu Asp Arg Gly Ala Ala Ser 50 55 60 tcc ctc gtc cgc tgc accgcc ttc cgc gac gcc tgc acc ccg gcg acg 240 Ser Leu Val Arg Cys Thr AlaPhe Arg Asp Ala Cys Thr Pro Ala Thr 65 70 75 80 acg acc acc gcc cag ctctct cgg aag ctg gga ggc gtg gcg gcg aac 288 Thr Thr Thr Ala Gln Leu SerArg Lys Leu Gly Gly Val Ala Ala Asn 85 90 95 aag gcg 
gtg gcg gcg gcg gcggag agg atc gtg aac gcc ggg agg gcg 336 Lys Ala Val Ala Ala Ala Ala GluArg Ile Val Asn Ala Gly Arg Ala 100 105 110 ccg gcg atg ttc gac gag ctccgt ggg cgg ctg cgg atg ggc ctg gtg 384 Pro Ala Met Phe Asp Glu Leu ArgGly Arg Leu Arg Met Gly Leu Val 115 120 125 aac atc ggc cgc gac gag ctgctg gcg ctc ggc gtg gag ggc gac gcc 432 Asn Ile Gly Arg Asp Glu Leu LeuAla Leu Gly Val Glu Gly Asp Ala 130 135 140 gtc ggc gtc gac ttc gag cgcgtc tcc gac atg ttc cgg tgg tcg gac 480 Val Gly Val Asp Phe Glu Arg ValSer Asp Met Phe Arg Trp Ser Asp 145 150 155 160 ctc ttc ccg gag tgg atcgac gag gag gag gac gac gag ggc ccg tcc 528 Leu Phe Pro Glu Trp Ile AspGlu Glu Glu Asp Asp Glu Gly Pro Ser 165 170 175 tgc ccg gag ctc ccc atgccg gac ttc tcc cgg tac ggc gac gtc gac 576 Cys Pro Glu Leu Pro Met ProAsp Phe Ser Arg Tyr Gly Asp Val Asp 180 185 190 gtg gtg gtg gcg tcg ctgccg tgc aac cgt tcg gac gcc gcg tgg aac 624 Val Val Val Ala Ser Leu ProCys Asn Arg Ser Asp Ala Ala Trp Asn 195 200 205 cgc gac gtg ttc agg ctgcag gtg cac ctc gtg acg gcg cac atg gcg 672 Arg Asp Val Phe Arg Leu GlnVal His Leu Val Thr Ala His Met Ala 210 215 220 gcg cgc aag ggg ctg cggcac gac gcc ggc ggc ggc ggc ggc ggc ggg 720 Ala Arg Lys Gly Leu Arg HisAsp Ala Gly Gly Gly Gly Gly Gly Gly 225 230 235 240 cgg gtg cgc gtg gtggtg cgc agc gag tgc gag ccc atg atg gac ttg 768 Arg Val Arg Val Val ValArg Ser Glu Cys Glu Pro Met Met Asp Leu 245 250 255 ttc cgg tgc gac gaggcg gtg ggg agg gac ggc gag tgg tgg atg tac 816 Phe Arg Cys Asp Glu AlaVal Gly Arg Asp Gly Glu Trp Trp Met Tyr 260 265 270 atg gtc gac gtc gagcgg ctg gag gag aag ctc cgg ctt cct gtc ggc 864 Met Val Asp Val Glu ArgLeu Glu Glu Lys Leu Arg Leu Pro Val Gly 275 280 285 tca tgc aac ctc gcccta cct ctg tgg gga ccc gga ggt atc cag gaa 912 Ser Cys Asn Leu Ala LeuPro Leu Trp Gly Pro Gly Gly Ile Gln Glu 290 295 300 gtg ttc aac gtg tcggag ctg acg gcg gcg gcg gca acg gcg ggg cgg 960 Val Phe Asn Val Ser GluLeu Thr Ala Ala Ala Ala Thr Ala Gly Arg 305 310 315 320 ccg cgg 
cgg gaggcg tac gcg acg gtg ctc cac tcg tcg gac acg tac 1008 Pro Arg Arg Glu AlaTyr Ala Thr Val Leu His Ser Ser Asp Thr Tyr 325 330 335 ctg tgc ggc gcgatc gtg ctg gcg cag agc atc cgg cgc gcc ggg tcg 1056 Leu Cys Gly Ala IleVal Leu Ala Gln Ser Ile Arg Arg Ala Gly Ser 340 345 350 acg cgc gac ctcgtc ctc ctc cac gac cac acc gtg tcg aag ccg gcg 1104 Thr Arg Asp Leu ValLeu Leu His Asp His Thr Val Ser Lys Pro Ala 355 360 365 ctg gcg gcg ctggtc gcc gcc ggc tgg acc ccg cgc aag atc aag cgc 1152 Leu Ala Ala Leu ValAla Ala Gly Trp Thr Pro Arg Lys Ile Lys Arg 370 375 380 atc cgc aac ccgcgc gcg gag cgc ggc acc tac aac gag tac aac tac 1200 Ile Arg Asn Pro ArgAla Glu Arg Gly Thr Tyr Asn Glu Tyr Asn Tyr 385 390 395 400 agc aag ttccgg ctg tgg cag ctc acc gac tac gac cgc gtg gtg ttc 1248 Ser Lys Phe ArgLeu Trp Gln Leu Thr Asp Tyr Asp Arg Val Val Phe 405 410 415 gtc gac gccgac atc ctc gtc ctc cgc gac ctc gac gcc ctc ttc ggc 1296 Val Asp Ala AspIle Leu Val Leu Arg Asp Leu Asp Ala Leu Phe Gly 420 425 430 ttc ccg cagctg acg gcg gtg ggc aac gac ggc tcg ctc ttc aac tcc 1344 Phe Pro Gln LeuThr Ala Val Gly Asn Asp Gly Ser Leu Phe Asn Ser 435 440 445 ggg gtg atggtg atc gag ccg tcg cag tgc acg ttc cag tcg ctg atc 1392 Gly Val Met ValIle Glu Pro Ser Gln Cys Thr Phe Gln Ser Leu Ile 450 455 460 cgg cag cggcgg acc atc cgg tcc tac aac ggc ggc gat cag ggg ttc 1440 Arg Gln Arg ArgThr Ile Arg Ser Tyr Asn Gly Gly Asp Gln Gly Phe 465 470 475 480 ctg aacgag gtg ttc gtc tgg tgg cac cgg ctg ccg cgg cgg gtg aac 1488 Leu Asn GluVal Phe Val Trp Trp His Arg Leu Pro Arg Arg Val Asn 485 490 495 tac ctcaag aac ttc tgg gcg aac act acg gcg gag cgg gcg ctc aag 1536 Tyr Leu LysAsn Phe Trp Ala Asn Thr Thr Ala Glu Arg Ala Leu Lys 500 505 510 gag cggctg ttc cgg gcg gat ccc gcg gag gtg tgg tcg atc cac tac 1584 Glu Arg LeuPhe Arg Ala Asp Pro Ala Glu Val Trp Ser Ile His Tyr 515 520 525 ctg gggctg aag ccg tgg acg tgc tac cgc gac tac gac tgc aac tgg 1632 Leu Gly LeuLys Pro Trp Thr Cys Tyr Arg Asp Tyr Asp Cys Asn Trp 530 535 
540 aac atcggc gac cag cgg gtg tac gcc agc gac gcc gcg cac gcg cgg 1680 Asn Ile GlyAsp Gln Arg Val Tyr Ala Ser Asp Ala Ala His Ala Arg 545 550 555 560 tggtgg cag gtg tac gac gac atg ggg gag gcc atg cgc tcg ccg tgc 1728 Trp TrpGln Val Tyr Asp Asp Met Gly Glu Ala Met Arg Ser Pro Cys 565 570 575 cgcctg tcg gag cgg agg aag atc gag atc gcc tgg gac cga cac ctc 1776 Arg LeuSer Glu Arg Arg Lys Ile Glu Ile Ala Trp Asp Arg His Leu 580 585 590 gccgag gag gcc ggc ttc tcc gac cac cac tgg aag atc aac atc acc 1824 Ala GluGlu Ala Gly Phe Ser Asp His His Trp Lys Ile Asn Ile Thr 595 600 605 gacccc cgc aag tgg gag tag 1845 Asp Pro Arg Lys Trp Glu * 610 26 614 PRTOryza sativa 26 Met Gly Val Thr Gly Gly Ala Gly Glu Ala Val Lys Pro SerSer Ser 1 5 10 15 Ser Ser Leu Ser Pro Val Ala Gly Leu Arg Ala Ala AlaIle Val Lys 20 25 30 Leu Asn Ala Ala Phe Leu Ala Phe Phe Phe Leu Ala TyrMet Ala Leu 35 40 45 Leu Leu His Pro Lys Tyr Ser Tyr Leu Leu Asp Arg GlyAla Ala Ser 50 55 60 Ser Leu Val Arg Cys Thr Ala Phe Arg Asp Ala Cys ThrPro Ala Thr 65 70 75 80 Thr Thr Thr Ala Gln Leu Ser Arg Lys Leu Gly GlyVal Ala Ala Asn 85 90 95 Lys Ala Val Ala Ala Ala Ala Glu Arg Ile Val AsnAla Gly Arg Ala 100 105 110 Pro Ala Met Phe Asp Glu Leu Arg Gly Arg LeuArg Met Gly Leu Val 115 120 125 Asn Ile Gly Arg Asp Glu Leu Leu Ala LeuGly Val Glu Gly Asp Ala 130 135 140 Val Gly Val Asp Phe Glu Arg Val SerAsp Met Phe Arg Trp Ser Asp 145 150 155 160 Leu Phe Pro Glu Trp Ile AspGlu Glu Glu Asp Asp Glu Gly Pro Ser 165 170 175 Cys Pro Glu Leu Pro MetPro Asp Phe Ser Arg Tyr Gly Asp Val Asp 180 185 190 Val Val Val Ala SerLeu Pro Cys Asn Arg Ser Asp Ala Ala Trp Asn 195 200 205 Arg Asp Val PheArg Leu Gln Val His Leu Val Thr Ala His Met Ala 210 215 220 Ala Arg LysGly Leu Arg His Asp Ala Gly Gly Gly Gly Gly Gly Gly 225 230 235 240 ArgVal Arg Val Val Val Arg Ser Glu Cys Glu Pro Met Met Asp Leu 245 250 255Phe Arg Cys Asp Glu Ala Val Gly Arg Asp Gly Glu Trp Trp Met Tyr 260 265270 Met Val Asp Val Glu Arg Leu Glu Glu Lys Leu Arg Leu Pro Val Gly 
275280 285 Ser Cys Asn Leu Ala Leu Pro Leu Trp Gly Pro Gly Gly Ile Gln Glu290 295 300 Val Phe Asn Val Ser Glu Leu Thr Ala Ala Ala Ala Thr Ala GlyArg 305 310 315 320 Pro Arg Arg Glu Ala Tyr Ala Thr Val Leu His Ser SerAsp Thr Tyr 325 330 335 Leu Cys Gly Ala Ile Val Leu Ala Gln Ser Ile ArgArg Ala Gly Ser 340 345 350 Thr Arg Asp Leu Val Leu Leu His Asp His ThrVal Ser Lys Pro Ala 355 360 365 Leu Ala Ala Leu Val Ala Ala Gly Trp ThrPro Arg Lys Ile Lys Arg 370 375 380 Ile Arg Asn Pro Arg Ala Glu Arg GlyThr Tyr Asn Glu Tyr Asn Tyr 385 390 395 400 Ser Lys Phe Arg Leu Trp GlnLeu Thr Asp Tyr Asp Arg Val Val Phe 405 410 415 Val Asp Ala Asp Ile LeuVal Leu Arg Asp Leu Asp Ala Leu Phe Gly 420 425 430 Phe Pro Gln Leu ThrAla Val Gly Asn Asp Gly Ser Leu Phe Asn Ser 435 440 445 Gly Val Met ValIle Glu Pro Ser Gln Cys Thr Phe Gln Ser Leu Ile 450 455 460 Arg Gln ArgArg Thr Ile Arg Ser Tyr Asn Gly Gly Asp Gln Gly Phe 465 470 475 480 LeuAsn Glu Val Phe Val Trp Trp His Arg Leu Pro Arg Arg Val Asn 485 490 495Tyr Leu Lys Asn Phe Trp Ala Asn Thr Thr Ala Glu Arg Ala Leu Lys 500 505510 Glu Arg Leu Phe Arg Ala Asp Pro Ala Glu Val Trp Ser Ile His Tyr 515520 525 Leu Gly Leu Lys Pro Trp Thr Cys Tyr Arg Asp Tyr Asp Cys Asn Trp530 535 540 Asn Ile Gly Asp Gln Arg Val Tyr Ala Ser Asp Ala Ala His AlaArg 545 550 555 560 Trp Trp Gln Val Tyr Asp Asp Met Gly Glu Ala Met ArgSer Pro Cys 565 570 575 Arg Leu Ser Glu Arg Arg Lys Ile Glu Ile Ala TrpAsp Arg His Leu 580 585 590 Ala Glu Glu Ala Gly Phe Ser Asp His His TrpLys Ile Asn Ile Thr 595 600 605 Asp Pro Arg Lys Trp Glu 610 27 626 DNAZea mays CDS (133)..(624) 27 ttcgagcggc cgccccgggc aggtacaaac ctgacgtgaaggctctaaag gagaagctca 60 ggctgcctgt tggttcctgt gagcttgctg ttccactcaacgcaaaagca cgactcttac 120 acggtagaca ga cgc aga gaa gca tat gct aca atactt cat tca gca agt 171 Arg Arg Glu Ala Tyr Ala Thr Ile Leu His Ser AlaSer 1 5 10 gaa tat gtt tgc ggt gcg ata aca gca gct caa agc att cgt caagca 219 Glu Tyr Val Cys Gly Ala Ile Thr Ala Ala Gln Ser Ile Arg Gln Ala15 20 25 gga 
tca aca aga gac ctt gtt att ctt gtt gat gac acc ata agt gac267 Gly Ser Thr Arg Asp Leu Val Ile Leu Val Asp Asp Thr Ile Ser Asp 3035 40 45 cac cac cgc aag ggg ctg gaa tct gct ggg tgg aag gtc aga ata ata315 His His Arg Lys Gly Leu Glu Ser Ala Gly Trp Lys Val Arg Ile Ile 5055 60 gaa agg atc cgg aat ccc aaa gcc gaa cgt gat gcc tac aac gaa tgg363 Glu Arg Ile Arg Asn Pro Lys Ala Glu Arg Asp Ala Tyr Asn Glu Trp 6570 75 aac tac agc aaa ttc cgg ctg tgg cag ctt aca gat tac gac aag gtt411 Asn Tyr Ser Lys Phe Arg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Val 8085 90 att ttc att gat gct gat ctg ctc atc ctg agg aac att gat ttc ttg459 Ile Phe Ile Asp Ala Asp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu 95100 105 ttt gca atg cca gaa atc acc gca act ggg aac aat gcc aca ctc ttc507 Phe Ala Met Pro Glu Ile Thr Ala Thr Gly Asn Asn Ala Thr Leu Phe 110115 120 125 aac tct ggg gtg atg gtc att gaa cct tca aac tgc acg ttc cagtta 555 Asn Ser Gly Val Met Val Ile Glu Pro Ser Asn Cys Thr Phe Gln Leu130 135 140 ctg atg gag cac atc aac gag ata aca tct tac aac ggt ggt gaccaa 603 Leu Met Glu His Ile Asn Glu Ile Thr Ser Tyr Asn Gly Gly Asp Gln145 150 155 ggg tac ctc ggc cgc gac cac gc 626 Gly Tyr Leu Gly Arg AspHis 160 28 164 PRT Zea mays 28 Arg Arg Glu Ala Tyr Ala Thr Ile Leu HisSer Ala Ser Glu Tyr Val 1 5 10 15 Cys Gly Ala Ile Thr Ala Ala Gln SerIle Arg Gln Ala Gly Ser Thr 20 25 30 Arg Asp Leu Val Ile Leu Val Asp AspThr Ile Ser Asp His His Arg 35 40 45 Lys Gly Leu Glu Ser Ala Gly Trp LysVal Arg Ile Ile Glu Arg Ile 50 55 60 Arg Asn Pro Lys Ala Glu Arg Asp AlaTyr Asn Glu Trp Asn Tyr Ser 65 70 75 80 Lys Phe Arg Leu Trp Gln Leu ThrAsp Tyr Asp Lys Val Ile Phe Ile 85 90 95 Asp Ala Asp Leu Leu Ile Leu ArgAsn Ile Asp Phe Leu Phe Ala Met 100 105 110 Pro Glu Ile Thr Ala Thr GlyAsn Asn Ala Thr Leu Phe Asn Ser Gly 115 120 125 Val Met Val Ile Glu ProSer Asn Cys Thr Phe Gln Leu Leu Met Glu 130 135 140 His Ile Asn Glu IleThr Ser Tyr Asn Gly Gly Asp Gln Gly Tyr Leu 145 150 155 160 Gly Arg AspHis 29 553 DNA Zea mays CDS 
(1)..(552) 29 tgg aag gtc aga ata ata gaaagg atc cgg aat ccc aaa gcc gaa cgt 48 Trp Lys Val Arg Ile Ile Glu ArgIle Arg Asn Pro Lys Ala Glu Arg 1 5 10 15 gat gcc tac aac gaa tgg aactac agc aaa ttc cgg ctg tgg cag ctt 96 Asp Ala Tyr Asn Glu Trp Asn TyrSer Lys Phe Arg Leu Trp Gln Leu 20 25 30 aca gat tac gac aag gtt att ttcatt gat gct gat ctg ctc atc ctg 144 Thr Asp Tyr Asp Lys Val Ile Phe IleAsp Ala Asp Leu Leu Ile Leu 35 40 45 agg aac att gat ttc ttg ttt gca atgcca gaa atc acc gca act ggg 192 Arg Asn Ile Asp Phe Leu Phe Ala Met ProGlu Ile Thr Ala Thr Gly 50 55 60 aac aat gcc aca ctc ttc aac tct ggg gtgatg gtc att gaa cct tca 240 Asn Asn Ala Thr Leu Phe Asn Ser Gly Val MetVal Ile Glu Pro Ser 65 70 75 80 aac tgc acg ttc cag tta ctg atg gag cacatc aac gag ata aca tct 288 Asn Cys Thr Phe Gln Leu Leu Met Glu His IleAsn Glu Ile Thr Ser 85 90 95 tac aac ggt ggt gac caa ggg tac ctg aac gagata ttc aca tgg tgg 336 Tyr Asn Gly Gly Asp Gln Gly Tyr Leu Asn Glu IlePhe Thr Trp Trp 100 105 110 cac cgg att cca aag cac atg aat ttc ttg aagcat ttc tgg gag ggt 384 His Arg Ile Pro Lys His Met Asn Phe Leu Lys HisPhe Trp Glu Gly 115 120 125 gat gag gac gaa gtg aag gcc aag aag act cggctg ttc ggc gcc aac 432 Asp Glu Asp Glu Val Lys Ala Lys Lys Thr Arg LeuPhe Gly Ala Asn 130 135 140 cca ccg atc ctc tac gtt ctc cac tac ttg gggcgg aag cca tgg ctg 480 Pro Pro Ile Leu Tyr Val Leu His Tyr Leu Gly ArgLys Pro Trp Leu 145 150 155 160 tgc ttc cgg gac tac gat tgc aac tgg aacgtc gag atc ttg cgg gag 528 Cys Phe Arg Asp Tyr Asp Cys Asn Trp Asn ValGlu Ile Leu Arg Glu 165 170 175 ttt gcg agt gac gtt gcg cat gcc c 553Phe Ala Ser Asp Val Ala His Ala 180 30 184 PRT Zea mays 30 Trp Lys ValArg Ile Ile Glu Arg Ile Arg Asn Pro Lys Ala Glu Arg 1 5 10 15 Asp AlaTyr Asn Glu Trp Asn Tyr Ser Lys Phe Arg Leu Trp Gln Leu 20 25 30 Thr AspTyr Asp Lys Val Ile Phe Ile Asp Ala Asp Leu Leu Ile Leu 35 40 45 Arg AsnIle Asp Phe Leu Phe Ala Met Pro Glu Ile Thr Ala Thr Gly 50 55 60 Asn AsnAla Thr Leu Phe Asn Ser Gly Val Met Val Ile 
Glu Pro Ser 65 70 75 80 AsnCys Thr Phe Gln Leu Leu Met Glu His Ile Asn Glu Ile Thr Ser 85 90 95 TyrAsn Gly Gly Asp Gln Gly Tyr Leu Asn Glu Ile Phe Thr Trp Trp 100 105 110His Arg Ile Pro Lys His Met Asn Phe Leu Lys His Phe Trp Glu Gly 115 120125 Asp Glu Asp Glu Val Lys Ala Lys Lys Thr Arg Leu Phe Gly Ala Asn 130135 140 Pro Pro Ile Leu Tyr Val Leu His Tyr Leu Gly Arg Lys Pro Trp Leu145 150 155 160 Cys Phe Arg Asp Tyr Asp Cys Asn Trp Asn Val Glu Ile LeuArg Glu 165 170 175 Phe Ala Ser Asp Val Ala His Ala 180 31 552 DNA Zeamays CDS (1)..(552) 31 tcc ctg cgc cgg ctc agc ccc aac gcc gac cgc gtcgtc atc gcg tcc 48 Ser Leu Arg Arg Leu Ser Pro Asn Ala Asp Arg Val ValIle Ala Ser 1 5 10 15 ctc gac gtc ccg ccg ctc tgg gtt cag gca ctg aaaaat gac ggg gta 96 Leu Asp Val Pro Pro Leu Trp Val Gln Ala Leu Lys AsnAsp Gly Val 20 25 30 aag gtg gtc tct gtg gag aat ttg aaa aat cct tac gagaaa caa gaa 144 Lys Val Val Ser Val Glu Asn Leu Lys Asn Pro Tyr Glu LysGln Glu 35 40 45 aat ttc aac aga cga ttc aaa ttg act tta aac aag ctg tatgca tgg 192 Asn Phe Asn Arg Arg Phe Lys Leu Thr Leu Asn Lys Leu Tyr AlaTrp 50 55 60 agc ttg gtt tca tat gag cga gtt gtt atg ctt gac tct gac aacatt 240 Ser Leu Val Ser Tyr Glu Arg Val Val Met Leu Asp Ser Asp Asn Ile65 70 75 80 ttc ctc caa aat act gat gag tta ttt cag tgt ggt cag ttc tgtgct 288 Phe Leu Gln Asn Thr Asp Glu Leu Phe Gln Cys Gly Gln Phe Cys Ala85 90 95 gtc ttc atc aat ccc tgt atc ttc cat aca ggt ctt ttt gtg ctt cag336 Val Phe Ile Asn Pro Cys Ile Phe His Thr Gly Leu Phe Val Leu Gln 100105 110 ccc tca atg gat gtt ttt aag aac atg cta cat gag cta gcg gtt gga384 Pro Ser Met Asp Val Phe Lys Asn Met Leu His Glu Leu Ala Val Gly 115120 125 cgt gaa aac cca gat ggg gca gac caa ggc ttc ctt gct agt tat ttc432 Arg Glu Asn Pro Asp Gly Ala Asp Gln Gly Phe Leu Ala Ser Tyr Phe 130135 140 ccg gac ttg ctt gat cag cca atg ttc cat cca cca gct aat ggt aca480 Pro Asp Leu Leu Asp Gln Pro Met Phe His Pro Pro Ala Asn Gly Thr 145150 155 160 aaa ctt tgg ggt act tat cgc ctc ccc cta ggc 
tac cag atg gatgca 528 Lys Leu Trp Gly Thr Tyr Arg Leu Pro Leu Gly Tyr Gln Met Asp Ala165 170 175 tct tac tat tat ctg aag ctt cgc 552 Ser Tyr Tyr Tyr Leu LysLeu Arg 180 32 184 PRT Zea mays 32 Ser Leu Arg Arg Leu Ser Pro Asn AlaAsp Arg Val Val Ile Ala Ser 1 5 10 15 Leu Asp Val Pro Pro Leu Trp ValGln Ala Leu Lys Asn Asp Gly Val 20 25 30 Lys Val Val Ser Val Glu Asn LeuLys Asn Pro Tyr Glu Lys Gln Glu 35 40 45 Asn Phe Asn Arg Arg Phe Lys LeuThr Leu Asn Lys Leu Tyr Ala Trp 50 55 60 Ser Leu Val Ser Tyr Glu Arg ValVal Met Leu Asp Ser Asp Asn Ile 65 70 75 80 Phe Leu Gln Asn Thr Asp GluLeu Phe Gln Cys Gly Gln Phe Cys Ala 85 90 95 Val Phe Ile Asn Pro Cys IlePhe His Thr Gly Leu Phe Val Leu Gln 100 105 110 Pro Ser Met Asp Val PheLys Asn Met Leu His Glu Leu Ala Val Gly 115 120 125 Arg Glu Asn Pro AspGly Ala Asp Gln Gly Phe Leu Ala Ser Tyr Phe 130 135 140 Pro Asp Leu LeuAsp Gln Pro Met Phe His Pro Pro Ala Asn Gly Thr 145 150 155 160 Lys LeuTrp Gly Thr Tyr Arg Leu Pro Leu Gly Tyr Gln Met Asp Ala 165 170 175 SerTyr Tyr Tyr Leu Lys Leu Arg 180 33 560 DNA Zea mays CDS (1)..(558) 33aaa cct gac gtg aag gcg ttg aag gag aag ctc agg ctg cct gtt ggt 48 LysPro Asp Val Lys Ala Leu Lys Glu Lys Leu Arg Leu Pro Val Gly 1 5 10 15tcc tgt gag ctt gct gtt cca ctc aac gca aaa gca cga ctc tac aca 96 SerCys Glu Leu Ala Val Pro Leu Asn Ala Lys Ala Arg Leu Tyr Thr 20 25 30 gtagac aga cgc aga gaa gca tat gcg aca ata ctg cat tca gca agt 144 Val AspArg Arg Arg Glu Ala Tyr Ala Thr Ile Leu His Ser Ala Ser 35 40 45 gaa tatgtt tgc ggc gcg atc acg gca gct caa agc att cgt caa gca 192 Glu Tyr ValCys Gly Ala Ile Thr Ala Ala Gln Ser Ile Arg Gln Ala 50 55 60 gga tca acaaga gac ctc gtt att ctc gtc gac gac acc ata agt gac 240 Gly Ser Thr ArgAsp Leu Val Ile Leu Val Asp Asp Thr Ile Ser Asp 65 70 75 80 cac cac cgcaag ggg ctg caa tct gcg ggg tgg aag gtc agg ata ata 288 His His Arg LysGly Leu Gln Ser Ala Gly Trp Lys Val Arg Ile Ile 85 90 95 cag agg atc cggaac ccc aaa gcc gag cgc gac gcc tac aac gag tgg 336 Gln Arg Ile Arg 
AsnPro Lys Ala Glu Arg Asp Ala Tyr Asn Glu Trp 100 105 110 aac tac agc aaattc cgg ctg tgg cag ctc acg gat tac gac aag gtc 384 Asn Tyr Ser Lys PheArg Leu Trp Gln Leu Thr Asp Tyr Asp Lys Val 115 120 125 atc ttc atc gacgcg gat ctc ctc atc ctg agg aac atc gat ttc ctg 432 Ile Phe Ile Asp AlaAsp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu 130 135 140 ttc gcg ctg ccggag atc acg gcg acg ggg aac aac gcg acg ctc ttc 480 Phe Ala Leu Pro GluIle Thr Ala Thr Gly Asn Asn Ala Thr Leu Phe 145 150 155 160 aac tcg ggagtg atg gtc atc gag cct tcg aac tgc acg ttc cgg cta 528 Asn Ser Gly ValMet Val Ile Glu Pro Ser Asn Cys Thr Phe Arg Leu 165 170 175 ctg atg gagcac atc gac gag ata acg tcg ta 560 Leu Met Glu His Ile Asp Glu Ile ThrSer 180 185 34 186 PRT Zea mays 34 Lys Pro Asp Val Lys Ala Leu Lys GluLys Leu Arg Leu Pro Val Gly 1 5 10 15 Ser Cys Glu Leu Ala Val Pro LeuAsn Ala Lys Ala Arg Leu Tyr Thr 20 25 30 Val Asp Arg Arg Arg Glu Ala TyrAla Thr Ile Leu His Ser Ala Ser 35 40 45 Glu Tyr Val Cys Gly Ala Ile ThrAla Ala Gln Ser Ile Arg Gln Ala 50 55 60 Gly Ser Thr Arg Asp Leu Val IleLeu Val Asp Asp Thr Ile Ser Asp 65 70 75 80 His His Arg Lys Gly Leu GlnSer Ala Gly Trp Lys Val Arg Ile Ile 85 90 95 Gln Arg Ile Arg Asn Pro LysAla Glu Arg Asp Ala Tyr Asn Glu Trp 100 105 110 Asn Tyr Ser Lys Phe ArgLeu Trp Gln Leu Thr Asp Tyr Asp Lys Val 115 120 125 Ile Phe Ile Asp AlaAsp Leu Leu Ile Leu Arg Asn Ile Asp Phe Leu 130 135 140 Phe Ala Leu ProGlu Ile Thr Ala Thr Gly Asn Asn Ala Thr Leu Phe 145 150 155 160 Asn SerGly Val Met Val Ile Glu Pro Ser Asn Cys Thr Phe Arg Leu 165 170 175 LeuMet Glu His Ile Asp Glu Ile Thr Ser 180 185 35 566 PRT Arabidopsisthaliana 35 Met Gly Ala Lys Ser Lys Ser Ser Ser Thr Arg Phe Phe Met PheTyr 1 5 10 15 Leu Ile Leu Ile Ser Leu Ser Phe Leu Gly Leu Leu Leu AsnPhe Lys 20 25 30 Pro Leu Phe Leu Leu Asn Pro Met Ile Ala Ser Pro Ser IleVal Glu 35 40 45 Ile Arg Tyr Ser Leu Pro Glu Pro Val Lys Arg Thr Pro IleTrp Leu 50 55 60 Arg Leu Ile Arg Asn Tyr Leu Pro Asp Glu Lys Lys Ile ArgVal Gly 65 
70 75 80 Leu Leu Asn Ile Ala Glu Asn Glu Arg Glu Ser Tyr GluAla Ser Gly 85 90 95 Thr Ser Ile Leu Glu Asn Val His Val Ser Leu Asp ProLeu Pro Asn 100 105 110 Asn Leu Thr Trp Thr Ser Leu Phe Pro Val Trp IleAsp Glu Asp His 115 120 125 Thr Trp His Ile Pro Ser Cys Pro Glu Val ProLeu Pro Lys Met Glu 130 135 140 Gly Ser Glu Ala Asp Val Asp Val Val ValVal Lys Val Pro Cys Asp 145 150 155 160 Gly Phe Ser Glu Lys Arg Gly LeuArg Asp Val Phe Arg Leu Gln Val 165 170 175 Asn Leu Ala Ala Ala Asn LeuVal Val Glu Ser Gly Arg Arg Asn Val 180 185 190 Asp Arg Thr Val Tyr ValVal Phe Ile Gly Ser Cys Gly Pro Met His 195 200 205 Glu Ile Phe Arg CysAsp Glu Arg Val Lys Arg Val Gly Asp Tyr Trp 210 215 220 Val Tyr Arg ProAsp Leu Thr Arg Leu Lys Gln Lys Leu Leu Met Pro 225 230 235 240 Pro GlySer Cys Gln Ile Ala Pro Leu Gly Gln Gly Glu Ala Trp Ile 245 250 255 GlnAsp Lys Asn Arg Asn Leu Thr Ser Glu Lys Thr Thr Leu Ser Ser 260 265 270Phe Thr Ala Gln Arg Val Ala Tyr Val Thr Leu Leu His Ser Ser Glu 275 280285 Val Tyr Val Cys Gly Ala Ile Ala Leu Ala Gln Ser Ile Arg Gln Ser 290295 300 Gly Ser Thr Lys Asp Met Ile Leu Leu His Asp Asp Ser Ile Thr Asn305 310 315 320 Ile Ser Leu Ile Gly Leu Ser Leu Ala Gly Trp Lys Leu ArgArg Val 325 330 335 Glu Arg Ile Arg Ser Pro Phe Ser Lys Lys Arg Ser TyrAsn Glu Trp 340 345 350 Asn Tyr Ser Lys Leu Arg Val Trp Gln Val Thr AspTyr Asp Lys Leu 355 360 365 Val Phe Ile Asp Ala Asp Phe Ile Ile Val LysAsn Ile Asp Tyr Leu 370 375 380 Phe Ser Tyr Pro Gln Leu Ser Ala Ala GlyAsn Asn Lys Val Leu Phe 385 390 395 400 Asn Ser Gly Val Met Val Leu GluPro Ser Ala Cys Leu Phe Glu Asp 405 410 415 Leu Met Leu Lys Ser Phe LysIle Gly Ser Tyr Asn Gly Gly Asp Gln 420 425 430 Gly Phe Leu Asn Glu TyrPhe Val Trp Trp His Arg Leu Ser Lys Arg 435 440 445 Leu Asn Thr Met LysTyr Phe Gly Asp Glu Ser Arg His Asp Lys Ala 450 455 460 Arg Asn Leu ProGlu Asn Leu Glu Gly Ile His Tyr Leu Gly Leu Lys 465 470 475 480 Pro TrpArg Cys Tyr Arg Asp Tyr Asp Cys Asn Trp Asp Leu Lys Thr 485 490 495 ArgArg Val Tyr Ala Ser 
Glu Ser Val His Ala Arg Trp Trp Lys Val 500 505 510Tyr Asp Lys Met Pro Lys Lys Leu Lys Gly Tyr Cys Gly Leu Asn Leu 515 520525 Lys Met Glu Lys Asn Val Glu Lys Trp Arg Lys Met Ala Lys Leu Asn 530535 540 Gly Phe Pro Glu Asn His Trp Lys Ile Arg Ile Lys Asp Pro Arg Lys545 550 555 560 Lys Asn Arg Leu Ser Glu 565
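As an informal consistency check on the listing, a coding sequence and its printed translation can be compared programmatically. The following is a minimal sketch, assuming the standard genetic code; it translates the first 16 codons of SEQ ID NO: 29 as printed above and compares them with the first 16 residues listed for SEQ ID NO: 30. The names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative only and are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter codes, matching the notation used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) into three-letter residues.

    Trailing bases that do not form a complete codon are ignored.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)]


# First 16 codons of SEQ ID NO: 29 and first 16 residues of SEQ ID NO: 30,
# copied from the listing above.
SEQ29_START = "tgg aag gtc aga ata ata gaa agg atc cgg aat ccc aaa gcc gaa cgt"
SEQ30_START = "Trp Lys Val Arg Ile Ile Glu Arg Ile Arg Asn Pro Lys Ala Glu Arg".split()

if __name__ == "__main__":
    print(translate(SEQ29_START) == SEQ30_START)
```

The same function can be applied to any other CDS in the listing once the position numbers and residue annotations have been stripped from the nucleotide text.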

What is claimed is:
 1. An isolated nucleic acid molecule that: (i) comprises a nucleotide sequence which encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 3, or a fragment thereof; (ii) comprises a nucleotide sequence at least 40% identical to SEQ ID NOs: 1 or 2, or a complement thereof as determined using the BESTFIT or GAP programs with a gap weight of 50 and a length weight of 3; or (iii) hybridizes to a nucleic acid molecule consisting of SEQ ID NOs: 1 or 2 under conditions of hybridization comprising washing at 60° C. twice for 15 minutes in 2×SSC, 0.5% SDS or a complement thereof.
 2. The isolated nucleic acid molecule of claim 1, wherein the nucleic acid molecule comprises SEQ ID NOs: 1 or 2, or a complement thereof.
 3. The isolated nucleic acid molecule of claim 1, comprising a nucleotide sequence selected from the group consisting of nucleotide residues 377 to 423, 516 to 592, 1039 to 1655, 1762 to 2536, and 2991 to 3264 of SEQ ID NO: 1.
 4. An isolated nucleic acid molecule that: (i) comprises a nucleotide sequence which encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 11, or a fragment thereof; (ii) comprises a nucleotide sequence at least 70% identical to SEQ ID NO: 10, or a complement thereof as determined using the BESTFIT or GAP programs with a gap weight of 50 and a length weight of 3, wherein the nucleotide sequence does not encode the amino acid sequence set forth in SEQ ID NO: 35; or (iii) hybridizes to a nucleic acid molecule consisting of SEQ ID NO: 10 under stringent conditions of hybridization, or a complement thereof, wherein the nucleotide sequence does not encode the amino acid sequence set forth in SEQ ID NO: 35.
 5. The isolated nucleic acid molecule of claim 4, wherein the nucleic acid molecule comprises SEQ ID NO: 10, or a complement thereof.
 6. An isolated nucleic acid molecule which encodes a polypeptide comprising the amino acid sequence that is at least 98% identical to SEQ ID NO: 9.
 7. An isolated nucleic acid molecule thereof comprising the nucleotide sequence of SEQ ID NO: 8, or a complement thereof.
 8. An isolated nucleic acid molecule that: (i) comprises a nucleotide sequence which encodes a polypeptide comprising the amino acid sequence of SEQ ID NOs: 7, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 34, or a fragment thereof; (ii) comprises a nucleotide sequence at least 70% identical to SEQ ID NOs: 4, 5, 6, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, 33, or a complement thereof as determined using the BESTFIT or GAP programs with a gap weight of 12 and a length weight of 4; or (iii) hybridizes to a nucleic acid molecule consisting of SEQ ID NOs: 4, 5, 6, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, 33 under stringent conditions of hybridization, or a complement thereof.
 9. The isolated nucleic acid molecule of claim 8, wherein the nucleic acid molecule comprises SEQ ID NOs: 4, 5, 6, 12, 14, 16, 18, 20, 23, 25, 27, 29, 31, 33, or a complement thereof.
 10. A fragment of the isolated nucleic acid molecule of any one of claims 1-9, wherein the fragment comprises at least 40, 60, 80, 100 or 150 contiguous nucleotides of the nucleic acid molecule.
 11. The isolated nucleic acid molecule of claim 1 comprising the nucleotide sequence of nucleotides 1-195 of SEQ ID NO: 2, or a complement thereof.
 12. An isolated polypeptide comprising the amino acid sequence of amino acid residues 1-65 of SEQ ID NO: 3, or a fragment thereof.
 13. An isolated polypeptide comprising: (i) an amino acid sequence that is at least 70% identical to SEQ ID NO: 3 or a fragment thereof as determined using the BESTFIT or GAP programs with a gap weight of 12 and a length weight of 4; (ii) an amino acid sequence encoded by the nucleic acid molecule of claim 1; or (iii) an amino acid sequence of SEQ ID NO: 3.
 14. An isolated polypeptide comprising: (i) an amino acid sequence at least 70% identical to SEQ ID NO: 11, or a fragment thereof as determined using the BESTFIT or GAP programs with a gap weight of 12 and a length weight of 4; (ii) an amino acid sequence encoded by the nucleic acid molecule of claim 4; or (iii) an amino acid sequence of SEQ ID NO: 11.
 15. An isolated polypeptide comprising: (i) an amino acid sequence that is at least 98% identical to SEQ ID NO: 9 as determined using the BESTFIT or GAP programs with a gap weight of 12 and a length weight of 4; (iii) an amino acid sequence encoded by the nucleic acid molecule of SEQ ID NO: 8, or a complement thereof; or (v) an amino acid sequence of SEQ ID NO: 9, or a fragment thereof.
 16. An isolated polypeptide comprising: (i) an amino acid sequence that is at least 70% identical to SEQ ID NOs: 7, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 34, or a fragment thereof as determined using the BESTFIT or GAP programs with a gap weight of 12 and a length weight of 4; (ii) an amino acid sequence encoded by the nucleic acid molecule of claim 8; (iii) an amino acid sequence of SEQ ID NOs: 7, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 34.
 17. A fragment of a polypeptide comprising at least 8 amino acid residues, wherein said fragment is a portion of the polypeptide encoded by a nucleic acid molecule selected from the group consisting of exon I, exon II, exon III, exon IV and exon V of SEQ ID NO: 1.
 18. A polypeptide comprising the amino acid sequence of SEQ ID NOs: 3, 7, 9, 11, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 34 which further comprises one or more conservative amino acid substitutions.
 19. A fusion polypeptide comprising the amino acid sequence of any one of claims 12-18 and a heterologous polypeptide.
 20. A fragment or immunogenic fragment of a polypeptide of any one of claims 12-18, wherein the fragment comprises at least 5, 8, 10, 15, 20, 25, 30 or 35 consecutive amino acids of the polypeptide.
 21. An antibody that immunospecifically binds to a polypeptide of any one of the claims 12-18.
 22. A method for making a polypeptide of any one of the claims 12-18, comprising the steps of: (a) culturing a cell comprising a recombinant polynucleotide encoding the polypeptide of any one of claims 12-18 under conditions that allow said polypeptide to be expressed by said cell; and (b) recovering the expressed polypeptide.
 23. A complex comprising a polypeptide encoded by a nucleic acid molecule of any of claims 1-9 and a starch molecule.
 24. The complex of claim 23, wherein the starch molecule comprises from 1 to 700 glucose units.
 25. The complex of claim 23, wherein the starch molecule comprises branching chains of glucose polysaccharides.
 26. A vector comprising a nucleic acid molecule of any one of claims 1-9.
 27. An expression vector comprising a nucleic acid molecule of any one of claims 1-9 and at least one regulatory region operably linked to the nucleic acid molecule.
 28. The expression vector of claim 27, wherein the regulatory region confers chemically-inducible, dark-inducible, developmentally regulated, developmental-stage specific, wound-induced, environmental factor-regulated, organ-specific, cell-specific, and/or tissue-specific expression of the nucleic acid molecule, or constitutive expression of the nucleic acid molecule.
 29. The expression vector of claim 27, wherein the regulatory region is selected from the group consisting of a 35S CaMV promoter, a rice actin promoter, a patatin promoter, and a high molecular weight glutenin gene of wheat.
 30. An expression vector comprising the antisense nucleotide sequence of a nucleic acid molecule of any one of claims 1-9, wherein the antisense sequence is operably linked to at least one regulatory region.
 31. A genetically-engineered cell which comprises a nucleic acid molecule of any one of claims 1-9.
 32. A cell comprising the expression vector of claim 27.
 33. A cell comprising the expression vector of claim 30.
 34. A genetically-engineered plant comprising the nucleic acid molecule of any of claims 1-9.
 35. The genetically-engineered plant of claim 34 and progeny thereof, further comprising a transgene encoding an antisense nucleotide sequence.
 36. The genetically-engineered plant of claim 31, further comprising a RNA interference construct.
 37. A cell comprising a 35S CaMV constitutive promoter operably linked to a nucleic acid molecule of SEQ ID NO: 2, or a rice actin promoter operably linked to a RNA interference construct comprising fragments of a nucleic acid molecule of SEQ ID NO: 2, wherein said rice actin promoter confers expression of said fragments.
 38. A method of altering starch synthesis in a plant comprising, introducing into a plant an expression vector of claim 27, such that starch synthesis is altered relative to a plant without the expression vector.
 39. A method of altering starch synthesis in a plant comprising, introducing into a plant at least an expression vector of claim 30, such that starch synthesis is altered in comparison to a plant without the expression vector.
 40. A method of altering starch granules in a plant comprising, introducing into a plant at least an expression vector of claim 27, such that the starch granules are altered in comparison to a plant without the expression vector.
 41. A method of altering starch granules in a plant comprising, introducing into a plant at least an expression vector of claim 30, such that the starch granules are altered in comparison to a plant without the expression vector.
 42. The method of claim 41, wherein starch granules are absent from leaves of the plant comprising at least an expression vector.
 43. A plant part comprising a nucleic acid molecule of any of claims 1-9, wherein starch synthesis is altered.
 44. The plant part of claim 43, wherein the part is a tuber, seed, or leaf.
 45. The modified starch obtained from the plant part of claim 43, wherein the modification in the synthesized starch is selected from the group consisting of a ratio of amylose to amylopectin, amylose content, size of starch granules, quantity of size of starch granules, a ratio of small to large starch granules, and rheological properties of the starch as measured using viscometric analysis.
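
Several of the claims above rest on simple sequence bookkeeping: residue ranges stated with 1-based inclusive coordinates (claim 3), "a complement thereof" (claims 1-9), and fragments of at least a stated number of contiguous nucleotides (claim 10). The following is a minimal sketch of those operations, offered only as an illustration. It does not reproduce the BESTFIT or GAP identity calculations or any hybridization test, and it has no bearing on claim construction. The function names are hypothetical, and reading "complement" as the reverse complement is an assumption made for this sketch.

```python
# Residue ranges recited in claim 3 (1-based, inclusive, relative to SEQ ID NO: 1).
CLAIM3_RANGES = [(377, 423), (516, 592), (1039, 1655), (1762, 2536), (2991, 3264)]

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def residues(seq, start, end):
    """Return residues start..end of seq using 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"range {start}-{end} is outside a sequence of length {len(seq)}")
    return seq[start - 1:end]


def range_lengths(ranges):
    """Length in residues of each 1-based inclusive (start, end) range."""
    return [end - start + 1 for start, end in ranges]


def reverse_complement(seq):
    """Reverse complement of a DNA string (assumed reading of 'complement')."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_contiguous_fragment(fragment, molecule, min_length):
    """True if fragment is an exact contiguous stretch of molecule of at least min_length."""
    return len(fragment) >= min_length and fragment in molecule


if __name__ == "__main__":
    # Lengths of the five claim 3 ranges, in residues.
    print(range_lengths(CLAIM3_RANGES))

    # Hypothetical toy sequence, not taken from the sequence listing.
    toy = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
    piece = residues(toy, 4, 12)
    print(piece, reverse_complement(piece))
    print(is_contiguous_fragment(piece, toy, min_length=8))
```

Exact substring matching is the strictest possible reading of "contiguous nucleotides of the nucleic acid molecule"; comparisons that tolerate mismatches would instead need an alignment tool of the kind named in the claims.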